
Apple Vision Pro Demo Impressions

I tried out the Apple Vision Pro (AVP) hardware in an Apple Store today. The ball was in Apple’s court: I really wanted the hardware to impress me and push me over the edge to pick one up and develop apps for it. I’ve released Day 1 apps for the Apple Watch & ARKit (iOS 11), and I believe in the future of AR for productivity.

Not Sharp

Unfortunately, when I wore the AVP, the content (text, images, etc.) was not razor sharp. I could use the device and navigate the OS without an issue, but I was expecting next-gen sharpness on the AVP displays.

My guess is that I could try different distances from the screen (closer or further) to find the spot where all the content is sharp and crisp, and probably try different light seals. Still, I couldn’t achieve the level of sharpness I expect from any 2020s device (phone, HD monitor, etc.).

Also, there was an opening at the bottom of my headset (towards my nose). I thought the light seals were supposed to block light from 360 degrees around the headset, not leave a small gap. Per the Apple rep, that gap was normal.

Demo

Apple did a great job with the demo. The demo was seated (smart) and focused on VR content, not passthrough use cases.

The OS (visionOS) was simple to use. Pinching to drag windows around or press buttons worked fine with hand gestures. However, when I tried to resize windows (from the bottom corners) or pull content apart with two hands, certain apps simply wouldn’t cooperate.

Content (2D vs 3D)

My imagined ideal use case would be having several large macOS screens in front of me to get work done. However, Apple marketing seems to be focused on entertainment (big TV in front of you) as their selling point.

The problem (in my opinion) is that the content was not great. Spatial content, shot on what I presume were iPhone 15 Pro Maxes or AVPs, seemed low resolution to me. Enlarging an iPhone photo to fill your entire room’s wall doesn’t work that well; it lacked detail. Even viewing a panorama (shot on iPhone? not sure), the resolution was not great at such a large size.

Part of the demos included immersive environments. The environments were impressive since they were built natively for the device and 3D rendered. Viewing a photo from the moon environment was great since the nearby 3D rocks on the ground really sell the illusion.

I personally felt the other content fell apart. Spatial videos (shot with an iPhone 15 Pro Max?) were fun, but they didn’t feel immersive to me since moving around lacked the convincing parallax you get from viewing things in everyday life.

While internet AVP users seem to enjoy viewing 2D movies on a giant, virtual screen, I think there is a huge opportunity for companies to build 3D immersive environments or games for users to be in (and interact with). Using the AVP’s state of the art hardware to view 2D images is like watching television without sound – a missed opportunity.

Takeaway

Despite the hardware issues (I suspect the light seal), I’d be interested in making AR apps for the AVP. However, paying almost $4K to buy what amounts to a dev kit is a tough sell for an indie developer. I honestly think Apple should have a program that lets developers borrow AVPs to build apps.

Intro to Computer Vision

I’m new to computer vision, and a lot of the basic concepts are very interesting. As an iOS developer, my interest comes from using CoreML & Apple’s Vision framework in apps to improve the user experience.

Two common tasks are classification and object detection. Classification allows you to detect the dominant objects present in an image. For example, classification can tell you that a photo is probably of a car.

Object detection is much more difficult since it not only recognizes what objects are present, but also detects where they are in the image. This means that object detection can tell you that there is probably a car within these bounds of the image.

What’s important is that the machine learning model runs in an acceptable amount of time, either asynchronously in the background or in real time. Apple provides a list of sample classification models at https://developer.apple.com/machine-learning/.
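
To make the classification case concrete, here’s a minimal sketch (not a drop-in implementation) of running one of those sample models with Vision, asynchronously so the UI stays responsive. The MobileNetV2 class name assumes you’ve added that model from Apple’s page; any image classifier generated by Xcode works the same way.

    import UIKit
    import CoreML
    import Vision

    // Minimal sketch: image classification with Vision + CoreML, off the main thread.
    // Assumes a classification model (e.g. MobileNetV2) has been added to the project.
    func classify(_ image: CGImage) {
        guard let coreMLModel = try? MobileNetV2(configuration: MLModelConfiguration()).model,
              let visionModel = try? VNCoreMLModel(for: coreMLModel) else { return }

        let request = VNCoreMLRequest(model: visionModel) { request, _ in
            // Results arrive as VNClassificationObservation, sorted by confidence;
            // the top result is the dominant object ("this photo is probably a car").
            guard let top = (request.results as? [VNClassificationObservation])?.first else { return }
            print("Probably a \(top.identifier) (confidence \(top.confidence))")
        }

        // Keep the UI responsive by doing the work in the background.
        DispatchQueue.global(qos: .userInitiated).async {
            let handler = VNImageRequestHandler(cgImage: image, options: [:])
            try? handler.perform([request])
        }
    }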

For real-time object detection, TinyYOLO is an option, even if the frame rate is not near 60 fps today. Other real-time detection models like full YOLO or R-CNN are not going to provide a sufficient experience on mobile devices today.
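
Here’s the object detection side of the same pipeline as a hedged sketch: identical Vision setup, but the results carry bounding boxes as well as labels. ObjectDetector is a hypothetical generated class standing in for a detection model (e.g. a converted TinyYOLO) whose output Vision can decode into VNRecognizedObjectObservation; with the earliest TinyYOLO conversions you had to parse the raw output grid yourself.

    import CoreML
    import Vision

    // Hedged sketch: object detection with Vision. ObjectDetector is a hypothetical
    // generated class for a detection model whose output Vision can decode.
    func detectObjects(in pixelBuffer: CVPixelBuffer) {
        guard let coreMLModel = try? ObjectDetector(configuration: MLModelConfiguration()).model,
              let visionModel = try? VNCoreMLModel(for: coreMLModel) else { return }

        let request = VNCoreMLRequest(model: visionModel) { request, _ in
            // Each observation carries a label and a normalized (0...1) bounding box:
            // "there is probably a car within these bounds of the image".
            for observation in (request.results as? [VNRecognizedObjectObservation]) ?? [] {
                let label = observation.labels.first?.identifier ?? "unknown"
                print("\(label) at \(observation.boundingBox)")
            }
        }
        request.imageCropAndScaleOption = .scaleFill

        // For real-time use, call this once per camera frame (e.g. from a capture callback).
        let handler = VNImageRequestHandler(cvPixelBuffer: pixelBuffer, options: [:])
        try? handler.perform([request])
    }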

One other interesting thing I came across is the PASCAL Visual Object Classes (VOC) challenge. It defines a set of common object classes used for benchmarking classification and detection.

For 2012, the twenty selected object classes were:

  • Person: person
  • Animal: bird, cat, cow, dog, horse, sheep
  • Vehicle: aeroplane, bicycle, boat, bus, car, motorbike, train
  • Indoor: bottle, chair, dining table, potted plant, sofa, tv/monitor

These are common objects used to train classification models.

Computer vision combined with machine learning has a tremendous amount of potential. Whether used with AR or in other use cases, it can provide a compelling user experience beyond Not Hotdog.

ARKit Impressions

I’ve been working with ARKit recently. I am planning on releasing an AR basketball game when iOS 11 is released.

Here are misc thoughts about working with ARKit:

  • It’s hard to find answers to common questions about doing simple things in ARKit. Searching for SceneKit yields slightly more results, but even that is sparse. The Apple developer SceneKit & ARKit forums don’t appear to have much activity either, so it’s up to StackOverflow & random Internet blog posts.
  • Working with ARKit means working with SceneKit, Apple’s framework for making 3D content easier for developers to work with. SceneKit & 3D are new to me, and a lot of the math around position, orientation, Euler angles, transforms, etc. can get complex fast once matrix transforms and quaternions are involved (see the first sketch after this list).
  • It’s really hard to find DAE/Collada assets. The DAE format is meant to be an interchange format that lets various 3D software packages communicate with each other. The reality is that exporting to DAE, or converting from another format to DAE, is a crapshoot. I’ve used Blender briefly to look at 3D assets, but digging into 3D modeling is a huge time sink for someone looking to get involved in ARKit. I wish there were an online store that focused on selling low poly (<10K), DAE files.
  • Related: working with 3D assets as someone new to them is very frustrating. The concept of bounds vs. scaling as it relates to importing into your SceneKit scene was very challenging (at least with the 3D model that I imported). If you have your own in-house or contracted 3D modeler, you should get 3D assets that work well with SceneKit, but I had countless issues with off-the-shelf 3D models & file formats.
  • After you’re able to import your 3D model, modeling the physics geometry can be a challenge. SceneKit lets you import the geometry for your physics body as-is using ConcavePolyhedron, but you probably don’t want that. I had to manually recreate a basketball hoop using multiple shapes combined into a single SCNNode (see the second sketch after this list).
  • ARKit is not all-powerful. The main feature that ARKit gives you is horizontal plane detection. Occlusion doesn’t come with ARKit. Expect many apps that deliver an experience reliant on a plane/surface like your desk or the floor.
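
First, a taste of the rotation math mentioned above: the same rotation expressed three ways in SceneKit (Euler angles, a quaternion, and the node’s transform matrix). The node and angle values are purely illustrative.

    import SceneKit
    import simd

    // Illustrative only: one rotation, three representations.
    let node = SCNNode(geometry: SCNBox(width: 1, height: 1, length: 1, chamferRadius: 0))

    // Euler angles (radians): rotate 45 degrees around the y-axis.
    node.simdEulerAngles = SIMD3<Float>(0, .pi / 4, 0)

    // The same rotation as a quaternion (axis-angle form).
    let yaw = simd_quatf(angle: .pi / 4, axis: SIMD3<Float>(0, 1, 0))
    node.simdOrientation = yaw

    // Composing rotations is quaternion multiplication, and order matters.
    let pitch = simd_quatf(angle: .pi / 8, axis: SIMD3<Float>(1, 0, 0))
    node.simdOrientation = yaw * pitch

    // Position, rotation, and scale all end up in the node's 4x4 transform.
    print(node.simdTransform)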

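Second, a hedged sketch of rebuilding physics geometry from simple shapes instead of importing the mesh as a concave polyhedron. The hoop dimensions are made up; the relevant part is keepAsCompound, which keeps the child shapes separate rather than collapsing them into one convex hull (which would fill in the hoop’s opening).

    import SceneKit

    // Illustrative sketch: approximate a hoop's rim with simple box segments and
    // build one compound physics body from them.
    func makeHoopNode() -> SCNNode {
        let hoop = SCNNode()

        let segmentCount = 8
        let radius: Float = 0.23
        for i in 0..<segmentCount {
            let angle = Float(i) / Float(segmentCount) * 2 * .pi
            let segment = SCNNode(geometry: SCNBox(width: 0.06, height: 0.02,
                                                   length: 0.06, chamferRadius: 0))
            segment.position = SCNVector3(radius * cos(angle), 0, radius * sin(angle))
            hoop.addChildNode(segment)
        }

        // .keepAsCompound keeps each child shape separate instead of merging them
        // into a single convex hull, so a ball can actually pass through the ring.
        let shape = SCNPhysicsShape(node: hoop, options: [.keepAsCompound: true])
        hoop.physicsBody = SCNPhysicsBody(type: .static, shape: shape)
        return hoop
    }
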
ARKit is exciting, but don’t expect the world yet. Future ARKit releases & better iOS hardware should provide more compelling experiences. Today, you can expect to play with 3D models on a surface (with surface interaction) or in the air (with limited or no environment interaction).
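
For reference, here’s a minimal sketch of that surface-based setup, assuming a standard ARSCNView-backed view controller: turn on horizontal plane detection and visualize each plane ARKit finds.

    import UIKit
    import ARKit
    import SceneKit

    // Minimal sketch: detect horizontal planes and draw a translucent overlay on each.
    class PlaneViewController: UIViewController, ARSCNViewDelegate {
        @IBOutlet var sceneView: ARSCNView!

        override func viewWillAppear(_ animated: Bool) {
            super.viewWillAppear(animated)
            sceneView.delegate = self

            let configuration = ARWorldTrackingConfiguration()
            configuration.planeDetection = .horizontal  // the main feature ARKit gives you
            sceneView.session.run(configuration)
        }

        // Called when ARKit anchors a new plane (e.g. your desk or the floor).
        func renderer(_ renderer: SCNSceneRenderer, didAdd node: SCNNode, for anchor: ARAnchor) {
            guard let planeAnchor = anchor as? ARPlaneAnchor else { return }

            let plane = SCNPlane(width: CGFloat(planeAnchor.extent.x),
                                 height: CGFloat(planeAnchor.extent.z))
            plane.firstMaterial?.diffuse.contents = UIColor.cyan.withAlphaComponent(0.3)

            let planeNode = SCNNode(geometry: plane)
            planeNode.eulerAngles.x = -.pi / 2  // SCNPlane is vertical by default
            node.addChildNode(planeNode)
        }
    }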