Architectural challenges in designing applications for augmented reality glasses
Developing software for augmented reality glasses requires new decisions in the field of user experience and extensive knowledge of the technical capabilities of this type of device. In this rapidly changing field, it is necessary to identify the basic classes of devices and to formulate an effective method for designing an appropriate application architecture for them. In this article we present both a classification of glasses and a method of making design decisions that leads to the successful creation of solutions of this type.
Below is a classification of devices belonging to the group of smart glasses, that is, computer equipment worn on the user's body. The classification created by the author is based on differences in: data processing power as an independent computer, the ability to analyze the position of the eyepiece in space (SLAM), the ability to display and position 3D objects in space, and whether or not the device has its own computer.
This made it possible to distinguish 4 main groups of devices:
1. Eyewear Data Stream Displays: AR Displays
2. Augmented Reality Eyewear Computers: AR Glasses
3. Mixed Reality Eyewear Computers: Mixed Reality Glasses
4. Eyewear Mixed Reality Displays: Mixed Reality Displays
In practice, each of these groups of devices requires different software design decisions. Each class is described below.
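The grouping above can be read as a two-question decision: does the device have its own computer, and does it support SLAM? The following Python sketch encodes that simplified reading. The names `DeviceClass` and `classify` are illustrative only and do not come from any vendor SDK; processing power and 3D positioning, which the classification also considers, are left out for brevity.

```python
from enum import Enum


class DeviceClass(Enum):
    AR_DISPLAY = "AR Display"
    AR_GLASSES = "AR Glasses"
    MR_GLASSES = "Mixed Reality Glasses"
    MR_DISPLAY = "Mixed Reality Display"


def classify(has_own_computer: bool, has_slam: bool) -> DeviceClass:
    """Map the two main classification criteria to one of the four groups."""
    if has_own_computer:
        return DeviceClass.MR_GLASSES if has_slam else DeviceClass.AR_GLASSES
    return DeviceClass.MR_DISPLAY if has_slam else DeviceClass.AR_DISPLAY


if __name__ == "__main__":
    # Attributes follow the descriptions in this article, not vendor data sheets.
    print(classify(has_own_computer=False, has_slam=False).value)  # e.g. Epson BT-35E
    print(classify(has_own_computer=True, has_slam=True).value)    # e.g. Microsoft Hololens 2
```

Keeping the decision in one function makes it easy to route a newly released device to the design guidelines of its group.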
Eyewear Data Stream Displays: AR Displays
Devices belonging to this group are the most basic solution for presenting information to the user's eyes. They do not have their own computer with an operating system but only receive data packets from external devices.
The model eyepiece of this type is the Epson BT-35E. This device connects via an HDMI port to video data streams. The displays present a clean binocular image in front of the user's eyes at a relatively high resolution. It is not possible to control this image; instead, only simple signals such as Play / Stop can be sent to the image transmitter. The eyepiece may have an IMU in order to, for example, rotate the view in a spherical film, but it cannot perform operations other than passive display.
Data stream displays are suitable wherever nothing more is needed than passing a video stream from cameras and monitoring systems to the user's eyes.
Augmented Reality Eyewear Computers: AR Glasses
A model Augmented Reality eyepiece should meet the following requirements:
1. Own computer and own operating system
2. Display with a field of view (FOV) of at least 14 degrees
3. 9-axis IMU (accelerometer, compass, gyroscope)
4. Additional sensors: intensity of the surrounding light, pressure sensor, others
5. Vision camera
6. Ability to communicate via Bluetooth, wifi, USB
8. Audio speakers
The technical parameters of these components differ from one model to the next. It is assumed that an eyepiece belonging to this group will correctly display pictograms, text, pictures and video, but it is not intended to generate complex 3D objects that could be rotated in space.
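The requirement list above can be turned into a simple checklist when comparing candidate devices. The sketch below is only an illustration: `EyepieceSpec` and `unmet_requirements` are hypothetical names, the "additional sensors" item is omitted because the list leaves it open-ended, and the sample values are invented rather than taken from any data sheet.

```python
from dataclasses import dataclass, field

MIN_FOV_DEGREES = 14                           # threshold from the requirement list
REQUIRED_LINKS = {"bluetooth", "wifi", "usb"}  # connectivity named in the list


@dataclass
class EyepieceSpec:
    name: str
    has_own_os: bool
    fov_degrees: float
    imu_axes: int
    has_camera: bool
    has_speakers: bool
    connectivity: set = field(default_factory=set)


def unmet_requirements(spec: EyepieceSpec) -> list:
    """Return the requirements from the list that the given spec fails."""
    unmet = []
    if not spec.has_own_os:
        unmet.append("own computer and operating system")
    if spec.fov_degrees < MIN_FOV_DEGREES:
        unmet.append("field of view of at least 14 degrees")
    if spec.imu_axes < 9:
        unmet.append("9-axis IMU")
    if not spec.has_camera:
        unmet.append("vision camera")
    if not REQUIRED_LINKS <= spec.connectivity:
        unmet.append("Bluetooth, wifi and USB connectivity")
    if not spec.has_speakers:
        unmet.append("audio speakers")
    return unmet


if __name__ == "__main__":
    # Invented sample values, for illustration only.
    sample = EyepieceSpec("hypothetical model", True, 12.0, 9, True, True,
                          {"bluetooth", "wifi", "usb"})
    print(unmet_requirements(sample))
```

Returning the list of failed items, rather than a single yes/no answer, shows at a glance how far a device is from the model.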
The fields of view of Augmented Reality glasses range from 9 to 23 degrees and the dominant resolution is 480px. Almost all AR glasses are monocular. This is probably due to the need for energy efficiency, combined with the fact that using one eye is generally acceptable to users. The light stream in these glasses is emitted from the side of the eyepiece. This does not allow a really large AR field of view, because in current optical systems only streaming from the top makes that possible (as seen in MR glasses).
These glasses have basic control options, usually based on a touchpad in the eyepiece itself or on external devices: a touchpad on a cable or a Bluetooth joystick. Some of them recognize simple voice commands. An important element of their operation is a companion application installed on a smartphone, from which you can manage the eyepiece or which you can use as a virtual touchpad (Vuzix Blade).
Designers try to make these glasses indistinguishable from classic glasses (this effect was achieved in the North Focals and Vuzix Blade glasses). If the equipment does not pretend to be ordinary glasses because it is, for example, an external display mounted on ordinary glasses, then the designers' efforts are aimed at creating an elegant and aesthetic device. Google Glass 2.0 is a great example.
In terms of the number and type of sensors, these glasses do not support SLAM (Simultaneous Localization and Mapping) and therefore cannot analyze their environment. These devices orient themselves in space using the IMU (accelerometer, gyroscope, compass) and GPS. This orientation can be enriched with real-time image analysis. Working time on these devices rarely exceeds 3 hours on their own battery. In the case of North Focals, the working time does reach 18 hours, but this is paid for by a much smaller display and frequent switching to standby mode.
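To illustrate how IMU readings can be combined into a usable heading, here is a minimal sketch of a complementary filter, a common sensor-fusion technique. The article does not state which fusion method the devices use, so this is an assumption chosen for simplicity. All names are hypothetical and the weight `alpha = 0.98` is an assumed starting point, not a measured value.

```python
def wrap_degrees(angle: float) -> float:
    """Wrap an angle to the range [0, 360)."""
    return angle % 360.0


def shortest_delta(target: float, current: float) -> float:
    """Signed smallest rotation from current to target, in (-180, 180]."""
    delta = (target - current) % 360.0
    return delta - 360.0 if delta > 180.0 else delta


def fuse_heading(previous: float, gyro_rate_dps: float, compass: float,
                 dt: float, alpha: float = 0.98) -> float:
    """Perform one complementary-filter step and return the new heading.

    previous      -- last fused heading, degrees
    gyro_rate_dps -- yaw rate from the gyroscope, degrees per second
    compass       -- heading from the magnetometer, degrees
    dt            -- time since the previous step, seconds
    alpha         -- weight of the gyroscope prediction (assumed value)
    """
    # Short term: trust the gyroscope, which is smooth but drifts.
    predicted = wrap_degrees(previous + gyro_rate_dps * dt)
    # Long term: pull towards the compass, which is noisy but drift-free.
    correction = shortest_delta(compass, predicted)
    return wrap_degrees(predicted + (1.0 - alpha) * correction)


if __name__ == "__main__":
    heading = 0.0
    for _ in range(200):  # stationary device, compass reads 90 degrees
        heading = fuse_heading(heading, 0.0, 90.0, dt=0.02)
    print(round(heading, 1))
```

The correction goes through `shortest_delta` so that a heading near 350 degrees is pulled towards a compass reading of 10 degrees through north rather than the long way round.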
The group of basic devices of this type includes:
Vuzix Blade, North Focals, Epson Moverio BT300 and BT350, Google Glass 2.0, Madgaze X5.
Importantly, preparing more complex solutions for these glasses usually requires processing on a mobile device such as a smartphone, or direct communication with efficient API servers. The eyepiece should receive processed data so that the logic performed on it is reduced to a minimum.
There is a subgroup of AR glasses dedicated to industry. It includes the Vuzix M300XL, Vuzix M400, Vuzix M4000 and RealWear HMT-1 models. In the case of industrial AR glasses, the application architecture is mainly based on processing on the eyepiece itself, without the need for additional processing on a smartphone. AR glasses are becoming full-fledged mobile devices that are expected to process data like an efficient tablet.
As a result of all these changes, AR Glasses are able to display, for example, 3D solids placed in a marked space. Such a solid can be rotated, but it is not possible to move around it in space.
The AR eyepiece has enough computing power to handle complex voice commands and the analysis of user gestures performed in front of the eyepiece camera. This type of equipment is a good fit for industrial applications because it can perform complex data processing without requiring anything more than a Wi-Fi connection to backend systems. In this type of device the size of the field of view outside the AR area is very important. Due to the risk of an accident, this space must be as unobstructed as possible.
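The idea of sending only processed data to the eyepiece can be sketched as follows. The phone reduces the route state to a small message and the eyepiece only decodes and displays it. The message fields, the JSON encoding and the names `EyepieceFrame`, `build_frame` and `read_frame` are assumptions made for this illustration; they do not describe the protocol of any specific product.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class EyepieceFrame:
    """Everything the eyepiece needs to draw one navigation update."""
    instruction: str   # text already formatted on the phone
    distance_m: int    # distance to the next manoeuvre, rounded
    bearing_deg: int   # direction of the next manoeuvre, 0-359


def build_frame(instruction: str, distance_m: float, bearing_deg: float) -> bytes:
    """Phone side: reduce the route state to a compact JSON message."""
    frame = EyepieceFrame(
        instruction=instruction,
        distance_m=round(distance_m),
        bearing_deg=round(bearing_deg) % 360,
    )
    return json.dumps(asdict(frame), separators=(",", ":")).encode("utf-8")


def read_frame(payload: bytes) -> EyepieceFrame:
    """Eyepiece side: decode and display, with no routing logic."""
    return EyepieceFrame(**json.loads(payload.decode("utf-8")))


if __name__ == "__main__":
    payload = build_frame("Turn left", 120.4, 271.8)
    print(len(payload), "bytes:", read_frame(payload))
```

Rounding and formatting on the phone keeps both the message and the eyepiece code small, which matters on a low-bandwidth link and a small battery.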
Mixed Reality Eyewear Computers: Mixed Reality Glasses
This is a group of devices that are to some extent aware of their surroundings: powerful computers worn on the head that analyze in real time the place where the user is (SLAM).
The MR eyepiece has several cameras and sensors enabling it to accomplish this task.
The MR eyepiece has all the features of an AR eyepiece but also adds:
2. Additional cameras and sensors
3. Additional control methods (gesture, head movements, eye movements)
As in the previous group, processing this type of information raises the requirements for computing power and energy consumption. This group includes two models: the Microsoft Hololens 2, weighing 566 grams, and the relatively light Thirdeye MR X2, weighing 180 grams.
Microsoft Hololens 2 has the richest environment analysis capabilities on the market: it can track the movements of the user's eyes and recognize gestures and sounds. It is a device designed for very demanding users. Thirdeye falls slightly behind in some areas, but adds a rather unique way of controlling the eyepiece by moving the head.
MR glasses aim to cover a large part of reality, which is why they have wide fields of view and clear, colorful 3D objects. On the one hand, this is a sensational complement to the real world; on the other, it seriously disturbs the perception of the real world and significantly obscures it.
The area visible outside the AR view in the Thirdeye eyepiece is small: it covers no more than 100 degrees of the conventional 180-degree viewing angle of a person without glasses. Therefore, business applications for MR glasses must take into account a significant reduction in the information available for the user's orientation in space and address the risk of an accident. Hololens 2 glasses are open at the sides, which obscures the real world the least and reduces bumps and trips.
The operating time of glasses with such capabilities is limited to 2-3 hours on one battery pack. Either the batteries can be replaced or the eyepiece must be connected to a power bank.
The MR eyepiece complements reality with complex 3D objects that can be rotated and viewed in any way. It is characteristic of these devices that a 3D object is anchored at a certain place in space and you can walk around it and transform it as expected.
Eyewear Mixed Reality Displays: Mixed Reality Displays
The intensive research and development work of two major investors, Magic Leap and Microsoft, inspired other companies to create surprisingly effective solutions with a different architecture. As with the companies mentioned above, the goal was to achieve a large field of view able to handle complex 3D objects while providing SLAM. However, the location of the computer was solved differently: it was moved outside the eyepiece. This created a new branch of glasses that have no computer of their own and instead consume calculations performed by other units.
These devices stream light from above using a set of small mirrors (in a form similar to that in Hololens). Thanks to this, a field of view of 42-52 degrees was obtained at 720px resolutions, which is a really sensational result. This was not without compromise. The optics of these glasses characteristically obstruct the upper part of the field of view, which means that users must unnaturally lift their head to see objects above them. The MR display is relatively light but protrudes from the head; it is impossible to hide the large size of the optical system. The computing is provided either by a smartphone or by a dedicated computer unit, exactly as Epson did in the Moverio BT300 and BT350 models.
Splitting the processing across two devices is a step that may prove to be key: it keeps the eyepiece light at the relatively low inconvenience of carrying a powerful laptop or smartphone. One should not forget another important addition to such a set: a high-capacity power bank.
Examples of devices of this type are Nreal, Madgaze Glow and 0Glasses Real-X.
AR Glasses software architecture design approach
We will now walk through a decision-making process leading to key architectural decisions in the development of software applications. Below we describe how architectural decisions were made for the VEO Navigation application: land and sea navigation, online and offline, supporting yachts, kayaks, cars, motorbikes, bicycles, walking and skiing. Navigation data is presented on several glasses belonging to our group of AR glasses, but the application can run on any eyepiece with the Android 6.0+ platform, a 9-axis IMU and Bluetooth.
The mandatory requirements that had to be handled were:
1. Work online / offline - this forces the existence of large data sets on devices that support navigation.
2. Marine mode and land mode - this forces different styles of navigation on land and on water, different units of measurement, different dynamics of changes in the application during navigation, and integration with on-board devices or the lack of it.
3. Data display on the AR eyepiece in the form of a 3D sphere with objects superimposed on the real world - the assumption of the project was to display data on the AR eyepiece; the decision was made in 2017, when there was not yet an eyepiece on the market that was certain to do the job.
4. Navigation control should take place on the phone, due to the high complexity of the operations performed when setting up navigation, downloading offline files, searching for objects etc. It was found that the simplified user interfaces of the glasses would not make it possible to configure the navigation task efficiently and to read all route data correctly. With the emergence of glasses capable of complex control of objects and data, this requirement has been moved to ordinary (not mandatory) requirements.
5. We cannot be dependent on global map and data providers, including Google Maps, due to licensing restrictions, price list volatility and the architectural constraints they impose - it was necessary to base the architecture on publicly available data from OpenStreetMap and our own data.
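The marine and land modes in requirement 2 call for different units of measurement. The sketch below shows one way to keep that difference in a single place. The choice of knots and nautical miles for the marine mode and km/h and kilometres for the land mode is an assumption based on common practice; the article does not list the units used in the application, and the names are illustrative.

```python
from enum import Enum

METERS_PER_NAUTICAL_MILE = 1852.0                  # by international definition
KNOTS_PER_MPS = 3600.0 / METERS_PER_NAUTICAL_MILE  # about 1.944
KMH_PER_MPS = 3.6


class NavigationMode(Enum):
    LAND = "land"
    MARINE = "marine"


def format_speed(speed_mps: float, mode: NavigationMode) -> str:
    """Format a speed given in metres per second for the active mode."""
    if mode is NavigationMode.MARINE:
        return f"{speed_mps * KNOTS_PER_MPS:.1f} kn"
    return f"{speed_mps * KMH_PER_MPS:.1f} km/h"


def format_distance(distance_m: float, mode: NavigationMode) -> str:
    """Format a distance given in metres for the active mode."""
    if mode is NavigationMode.MARINE:
        return f"{distance_m / METERS_PER_NAUTICAL_MILE:.2f} NM"
    return f"{distance_m / 1000.0:.2f} km"


if __name__ == "__main__":
    for mode in NavigationMode:
        print(mode.value, format_speed(5.0, mode), format_distance(2500.0, mode))
```

Storing values internally in SI units and converting only at display time means the routing and sensor code does not need to know which mode is active.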
Step 1: Technical and business feasibility. This step consists in defining a list of key requirements, i.e. requirements that are not negotiable, and conducting feasibility studies against the technical capabilities of the devices on the market. On this basis, devices that are unable to perform specific tasks, or that are too expensive for our model user, are rejected. In extreme cases, no eyepiece can meet the requirements, which stops the project completely. This stage is carried out over several analytical cycles between the Product Manager and the Solution Architect. Iterations continue until the Product Manager is able to define a valuable software product and the Architect confirms feasibility on at least several devices at a price acceptable to the customer.
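The filtering described in Step 1 can be sketched as a set comparison: a device stays on the list only if it covers every non-negotiable requirement and fits the price limit. The device names, prices and capability labels below are invented for illustration, and `min_devices=2` is an assumed reading of "at least several devices".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    name: str
    price: float
    capabilities: frozenset


def feasible_devices(candidates, must_have, max_price):
    """Keep devices that cover every key requirement within the price limit."""
    required = frozenset(must_have)
    return [
        c for c in candidates
        if required <= c.capabilities and c.price <= max_price
    ]


def project_can_proceed(candidates, must_have, max_price, min_devices=2):
    """Step 1 succeeds only if enough devices remain after filtering."""
    return len(feasible_devices(candidates, must_have, max_price)) >= min_devices


if __name__ == "__main__":
    # Hypothetical market survey, not real product data.
    market = [
        Candidate("Device A", 900.0, frozenset({"imu9", "bluetooth", "android6"})),
        Candidate("Device B", 2500.0, frozenset({"imu9", "bluetooth", "android6"})),
        Candidate("Device C", 700.0, frozenset({"bluetooth"})),
    ]
    must = {"imu9", "bluetooth", "android6"}
    print([c.name for c in feasible_devices(market, must, max_price=1000.0)])
```

Each analytical cycle between the Product Manager and the Architect then amounts to adjusting `must_have` or `max_price` and re-running the filter.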
Step 2: Design system components. This step consists in proposing the optimal structure of components and allocating the type and scope of processing to each of them. Obtaining the knowledge needed to make such decisions is difficult and requires a number of studies. As a result, we get layers of the solution that are processed on individual devices as part of various components.
Step 3: Technology selection. This step consists in selecting technologies and suitable ready-made components for the individual layers of the solution. The decisive factors are the functional scope of existing development environments and ready-made software components, the cost of software production, and the flexibility of the future software in terms of changes.
Step 4: Integration methods. This step consists in choosing integration methods between layers. We choose from REST API methods, GraphQL API, class sharing between technologies, and methods based on file transfer.
Step 5: Ensure flexibility of the solution. In this step, we assess how easily individual system components can switch to other technologies or devices, e.g. moving from Android to iOS on the mobile device or replacing one eyepiece with another. We try to correct the created architecture so as not to close off paths of future development.
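One common way to keep an eyepiece replaceable, as Step 5 requires, is to let the application depend on a small interface rather than on a concrete device. The sketch below illustrates that design choice; it is not taken from the project described here, and the class names and the 20-character limit are hypothetical.

```python
from abc import ABC, abstractmethod


class EyepieceDisplay(ABC):
    """What the navigation logic needs from any eyepiece."""

    @abstractmethod
    def show_instruction(self, text: str) -> str:
        """Render one instruction and return what was actually displayed."""


class MonocularDisplay(EyepieceDisplay):
    """Hypothetical small display: long text is shortened."""

    def __init__(self, max_chars: int = 20):
        self.max_chars = max_chars

    def show_instruction(self, text: str) -> str:
        return text[: self.max_chars]


class WideDisplay(EyepieceDisplay):
    """Hypothetical large display: text is shown in full."""

    def show_instruction(self, text: str) -> str:
        return text


def announce_turn(display: EyepieceDisplay, street: str) -> str:
    # Application code depends only on the interface, not on a device model.
    return display.show_instruction(f"Turn left onto {street}")


if __name__ == "__main__":
    for device in (MonocularDisplay(), WideDisplay()):
        print(type(device).__name__, "->", announce_turn(device, "Main Street"))
```

Supporting a new eyepiece then means adding one class, while `announce_turn` and the rest of the navigation logic stay unchanged.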
It is worth adding that the full decision-making process regarding the shape of the solution's architecture may take many months. It strongly depends on the devices available on the market and can continue "in the background" throughout the entire 18-month project period. In fact, the creation of IT architecture never ends, because on the dynamic IT market there are often smaller or larger technological breakthroughs that open new perspectives for the solutions being created. This approach is therefore iterative and we present it in Figure 1.
Figure 1: Approach to architecture design in an AR glasses project
Both the method developed and the decision-making process carried out in the project were assessed as optimal in terms of quality under the market conditions of this project. The quality of this process can be measured by the number of code refactors and the number of components replaced. In the analyzed case, about 10% of the budget was spent on refactoring individual components. This refactoring resulted twice from an error in the choice of component (Graphhopper-Valhalla in the mobile application), once from the change of the authorization module in the backend, and several times from smaller errors in the approach taken by programmers. About 5% of the budget was spent on removing a component and embedding another. Some of these errors were due to gaps in knowledge during planning, and some were deliberate efforts to investigate certain phenomena, the results of which can be written off. However, losses were rare. The standard outcome of the research was that the initial code was reused in further stages of work.
Was it possible to avoid these excess costs? If at least one full project of this type had been carried out earlier, the answer would be yes. The knowledge acquired during an exhaustive project in this field can be used to further minimize errors and reduce the delivery time of a similar project.
Vuzix website https://www.vuzix.com/
Epson website https://www.epson.com/
Microsoft website https://www.microsoft.com/
Madgaze website https://www.madgaze.com/
Thirdeye website https://www.thirdeyegen.com/
Google glasses website https://www.google.com/glass/start/
North glasses website https://www.bynorth.com/