Viewpoint
Abstract
The concept of the reality-virtuality (RV) continuum was introduced by Paul Milgram and Fumio Kishino in 1994. It describes a spectrum that ranges from a purely physical reality (the real world) to a purely virtual reality (a completely computer-generated environment), with various degrees of mixed reality in between. This continuum is “realized” by different types of displays to encompass different levels of immersion and interaction, allowing for the classification of different types of environments and experiences. What is often overlooked in this concept is the act of removing real objects (or persons, animals, etc) from reality, that is, a diminution, as opposed to adding virtual content to it, that is, an augmentation. Hence, we propose in this contribution an update or modification of the RV continuum in which the diminished reality aspect is more prominent. We hope this will help users, especially those who are new to the field, to get a better understanding of the entire extended reality (XR) topic, as well as assist in the decision-making for hardware (devices) and software or algorithms that are needed for new diminished reality applications. In addition, we propose another, more sophisticated directional and layer-based taxonomy for the RV continuum that we believe goes beyond the mediated and multimediated realities. Furthermore, we raise the question of whether the RV continuum truly ends on one side with physical reality.
JMIR XR Spatial Comput 2024;1:e52904; doi:10.2196/52904
Keywords
Introduction
The reality-virtuality (RV) continuum is a concept introduced by Paul Milgram and Fumio Kishino [ ] in 1994. It describes a spectrum that ranges from a purely physical reality (the real world) to a purely virtual reality (VR; a completely computer-generated environment), with various degrees of mixed reality (MR) in between. This continuum is “realized” by different types of displays [ ] to encompass different levels of immersion and interaction, allowing for the classification of different types of environments and experiences. The RV continuum helps us understand the varying levels of immersion and interactivity that technology can provide. As technology advances, the boundaries between these immersion levels can become more fluid, and new hybrid experiences can emerge. The continuum is particularly relevant in fields such as VR, augmented reality (AR), and MR, where researchers and developers aim to create more compelling and natural experiences that bridge the gap between the physical and virtual worlds. We used ChatGPT (OpenAI) [ ] to gauge the current state of the RV continuum. According to ChatGPT, the continuum is often divided into several main categories (note that we adapted the ChatGPT results and enhanced them with concrete examples where necessary; [ ]). The original ChatGPT transcript is shown in [ ].
ChatGPT prompt:
- What is the reality-virtuality continuum?
Main categories (modified ChatGPT output):
- Physical reality (real environment): This is the state of the unmediated physical world, where all sensory perceptions are naturally experienced without any technological augmentation.
- Augmented reality (AR): In this category, virtual elements are overlaid onto the real world. AR enhances the user’s perception of the physical world by adding computer-generated visual, auditory, or haptic information. Examples include smartphone apps that display digital information on top of real-world views, such as Pokémon GO [ ].
- Mixed reality (MR): MR environments combine virtual and physical elements in a way that allows them to interact in real time. Users can manipulate both virtual and real objects, and the distinction between the 2 can be blurred. Microsoft’s HoloLens is an example of an MR device that enables users to interact with holographic objects in their real-world environment [ ].
- Virtual reality (VR): In VR, users are completely immersed in a computer-generated environment that can simulate various sensory experiences. VR typically involves the use of head-mounted displays and other input devices to provide a sense of presence within the virtual environment [ ]. Prominent examples are the HTC Vive and the Meta Quest.
- Augmented virtuality (AV): This term is less commonly used than the others. It refers to scenarios where real-world elements are brought into a virtual environment, for example, by capturing real objects or people and placing them into a virtual space. The Varjo XR-3 provides such a function and can pass a video stream into the virtual world (VR). A concrete example could be showing a video stream of the (real) smartphone in VR, so the user can answer a text message without actually leaving VR (ie, without removing the headset and thus breaking the illusion of being in “another world,” the simulated virtual environment).
Diminished Reality
What is often overlooked in this concept is the act of diminishing real objects (or persons, animals, etc) from reality, rather than augmenting the reality with virtual things [ , ]. An introduction to the topic can be found in Cheng et al [ ]. A reason for this is that diminishing something from reality generally requires a sophisticated understanding of the real scene or environment to make the diminishing aspect convincing. In AR, the real world is just overwritten with a virtual object. In diminished reality (DR), however, the real-world part that is augmented or diminished needs to fit seamlessly into the reality around it. In addition, this should all be performed in real time while a user is walking around the real world, and an algorithm has to do the following (note that the first 3 items are part of the Extent of World Knowledge axis of the taxonomy by Milgram and Kishino [ ]):
- Detect and track the real object that has to be removed or diminished;
- Perform geometric modeling of the scene and objects to be added or subtracted (preexisting or captured once or in real time);
- Apply the lighting model of the scene to objects added or to part of the revealed scene when something is removed (preexisting or captured once or in real time); and then
- Combine all the previous points together as the scene description for the rendering algorithm.
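The four steps above can be illustrated with a deliberately simplified sketch. It is not the method used in any of the cited works: it assumes a static camera, a grayscale image stored as nested lists, and a background image of the scene that was "captured once" beforehand, which stands in for the geometric model. All names (`detect_object`, `estimate_lighting_gain`, `diminish`) are hypothetical and were chosen for illustration only.

```python
from typing import List

Image = List[List[float]]  # grayscale image with values in [0, 1]
Mask = List[List[bool]]    # True where the object to be diminished was found


def detect_object(frame: Image, background: Image, threshold: float = 0.2) -> Mask:
    """Step 1 (toy): flag pixels that differ strongly from the stored background."""
    return [
        [abs(f - b) > threshold for f, b in zip(frame_row, bg_row)]
        for frame_row, bg_row in zip(frame, background)
    ]


def estimate_lighting_gain(frame: Image, background: Image, mask: Mask) -> float:
    """Step 3 (toy): one global brightness ratio, measured on unmasked pixels only."""
    frame_sum = 0.0
    background_sum = 0.0
    for frame_row, bg_row, mask_row in zip(frame, background, mask):
        for f, b, masked in zip(frame_row, bg_row, mask_row):
            if not masked:
                frame_sum += f
                background_sum += b
    return frame_sum / background_sum if background_sum > 0 else 1.0


def diminish(frame: Image, background: Image, threshold: float = 0.2) -> Image:
    """Step 4: combine detection, stored background (step 2), and lighting."""
    mask = detect_object(frame, background, threshold)
    gain = estimate_lighting_gain(frame, background, mask)
    return [
        # Masked pixels are replaced by the relit background; others are kept.
        [min(1.0, b * gain) if masked else f
         for f, b, masked in zip(frame_row, bg_row, mask_row)]
        for frame_row, bg_row, mask_row in zip(frame, background, mask)
    ]
```

Calling `diminish(frame, background)` returns a new image in which the detected pixels show the stored background, rescaled to the current brightness. A real DR system would replace every stage with something far more capable (tracking, 3D reconstruction, a proper lighting model), which is exactly why the precision and real-time requirements are so demanding.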
All of this has to be done not only in real time but also with very high precision. The inserted virtual object has to fit seamlessly into and make sense with the reality; minor discrepancies will appear to be a glitch and will be noticed immediately by the user, as we recently observed in a DR user study [ ]. In fact, we think that diminution and augmentation require fundamentally different technologies. In our opinion, an augmentation may need to alter reality at a certain position with regard to other (real) objects (eg, displaying a patient’s tumor as an AR hologram on the patient in front of you, at the real position, such as for needle guidance [ ]), but no seamless and semantic fitting is necessary. As soon as a virtual object needs to fit into the scene semantically, we consider this to require diminution. Hence, for augmentation, you only need a volume rendering process with some basic options, such as position, size, and transparency. For diminution, however, additional and fundamentally different technologies are needed. The scene has to be analyzed and understood, and a meaningful replacement has to be generated and inserted as an AR hologram. An example could be glasses that are removed from a person in front of you.
In summary, the user has to get the impression that the real, diminished object does not exist at all in reality [ ]. Besides sophisticated algorithms, this course of action needs a considerable amount of computing power. Fortunately, there has been tremendous progress in both areas in recent years, with deep learning–based approaches and GPUs that can run these kinds of algorithms, even in real time. As a result, DR has already found its way into some applications [ ], such as virtual furniture removal for redecorating purposes (eg, IKEA Kreativ [ ]). Other possible applications for DR include the following:
- Privacy enhancing: In a live video feed, certain objects or information can be blurred or removed in real time to protect sensitive or private data.
- Training and education: DR can be used to remove distractions in a learning environment or highlight specific items to focus on.
- Therapeutic applications: For someone with a phobia of spiders, a DR system could recognize spiders in the person’s field of view and diminish or replace them with less threatening images to reduce anxiety. Additionally, sensory overload, a feature of autism, could be diminished with a DR system, to reduce overstimulation.
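To make the privacy-enhancing case more tangible, the following minimal sketch contrasts two ways of hiding a rectangular region in a single grayscale frame: overwriting it with a black bar, and filling it with the mean of the pixels that directly surround it. The second variant is only a crude stand-in for context-aware inpainting and is not a learned model; the function names and the `(top, left, bottom, right)` box convention are assumptions made for this illustration.

```python
from typing import List, Tuple

Image = List[List[float]]        # grayscale image with values in [0, 1]
Box = Tuple[int, int, int, int]  # (top, left, bottom, right); bottom/right exclusive


def black_bar(image: Image, box: Box) -> Image:
    """Hide the region by overwriting it with black, ignoring its surroundings."""
    top, left, bottom, right = box
    out = [row[:] for row in image]  # copy so the input frame stays untouched
    for y in range(top, bottom):
        for x in range(left, right):
            out[y][x] = 0.0
    return out


def border_mean_fill(image: Image, box: Box) -> Image:
    """Hide the region by filling it with the mean of the 1-pixel ring around it."""
    top, left, bottom, right = box
    height, width = len(image), len(image[0])
    ring = [
        image[y][x]
        for y in range(max(0, top - 1), min(height, bottom + 1))
        for x in range(max(0, left - 1), min(width, right + 1))
        if not (top <= y < bottom and left <= x < right)
    ]
    fill = sum(ring) / len(ring) if ring else 0.0
    out = [row[:] for row in image]
    for y in range(top, bottom):
        for x in range(left, right):
            out[y][x] = fill
    return out
```

On a uniformly lit surface, `border_mean_fill` makes the region blend into its surroundings, whereas `black_bar` leaves an obvious artifact. On textured content such as a face, neither is convincing, which is where trained inpainting networks and the corresponding computing power come in.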
Directional and Layer-Based Taxonomy
For all of the aforementioned reasons, we think that DR needs to be more prominent on the RV continuum, as shown in [ ], without delving deeper into the broad topics of mediated reality [ ] or even multimediated reality [ ]. This will not only assist in the decision-making for hardware (devices) and software that are needed for new DR applications but also help unfamiliar users to get a better understanding of the entire extended reality (XR) topic (note that we are addressing this revision to the continuum purely from an application or user point of view [POV], not from the POV of an MR researcher or engineer). An example application for DR could be the real-time anonymization of a face via XR. There is a huge difference between a device that detects the eye area and simply paints a black bar over the eyes (without considering the surrounding facial area) and one that inpaints the eyes with different, meaningful ones that fit perfectly with the surrounding facial area. The black bar approach can probably be performed on a current smartphone, whereas the second approach needs much more sophisticated hardware and computing power, with an integrated GPU that can run a trained, deep inpainting neural network in real time (note that a user with an XR headset would generally move around, which also changes the POV on the face to be anonymized, so the inpainting algorithm also has to be executed continuously in real time). In this context, we also think that the upcoming Apple Vision Pro will push the limits in DR, because it is a video-see-through device that can enable DR to reach its full potential [ ]. In fact, the Digital Crown hardware of the Apple Vision Pro, which also exists for the Apple Watch, should enable us to seamlessly walk along the whole RV continuum (back and forth) and bring medical DR applications to reality, which are currently almost nonexistent [ ]. A potential example of the photo-editing capabilities of newer cell phones as a diminution operation is shown in [ ].
In this medical example, DR enables the removal of a skin tumor virtually from a patient’s face before surgery.
Diminished virtuality (DV), however, remains a curiosity, and not even ChatGPT, which has been fed with 300 billion words from books, web texts, Wikipedia, articles, etc [ ], can handle this term as of January 2024 (the original ChatGPT transcript is shown in [ ]).
ChatGPT prompt:
- What is “Diminished Virtuality”?
ChatGPT output:
- “Diminished Virtuality” is not a widely recognized term or concept as of my last knowledge update in January 2022. It’s possible that this term has been coined or introduced in a specific context or field after that date, or it may be a term used in a niche or specialized area that is not widely known.
- If you have more context or information about where you came across this term or its usage, I would be happy to try to provide more information or insight based on that context.
DV can perhaps be best imagined as starting on the extreme right side of the RV continuum, which contains only virtual objects (virtual environment and VR), and going along the continuum toward the left to reach diminished virtual elements (see the figure of the RV continuum by Iriqat and Vatansever [ ], and note that this is especially of interest for computer-generated structures that are very complex and where only some parts need to be diminished). This should be possible with the Digital Crown hardware of the Apple Vision Pro: when fully immersed, the virtuality is reduced to make the real world partly visible. Apple describes scenarios where people might want to increase the pass-through when they intend to interact with a nearby physical object or read text on another device. Another real-world scenario could be a so-called tumor board meeting that is held in VR, where several experts from around the world gather to discuss a rare case and certain virtual elements (images, texts, etc) are diminished because of privacy issues (note that the mixture of reality here is also modifiable): an externally tuned-in expert (who is traveling) can see certain patient information because he or she belongs to the clinic where the patient is located, whereas another externally tuned-in expert who does not belong to the clinic of the patient should not see or hear certain patient information. However, the full potential of DV would unfold when the virtuality is diminished in a way that also fits with the upcoming reality. An example would be a real-world person showing up in VR between virtual objects. Imagine layers of virtual and real context that are stacked up but still seamlessly interplay with each other for the final output. In fact, this layer-based concept could also extend to mediated and even multimediated realities, where, for example, different sounds (real and virtual) interplay with each other, depending on their location and depth. “Enaudio” (the “hearing” equivalent of envision) hearing real rain outside in the real world that is mixed with virtual sounds while you are inside a virtual room.
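The layer idea can be sketched for a single pixel as follows. This is only an illustration under stated assumptions: each layer carries one grayscale sample, layers are blended back to front with the standard "over" operator, and a single `immersion` value between 0 and 1 (a hypothetical stand-in for a dial such as the Digital Crown, not Apple's actual API) scales the opacity of virtual layers only. The `Layer` class and `composite_pixel` function are invented for this example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Layer:
    name: str
    is_virtual: bool  # True for computer-generated content, False for pass-through
    depth: float      # distance from the viewer; larger means farther away
    value: float      # grayscale sample of this layer at the pixel, in [0, 1]
    opacity: float    # intrinsic opacity of the layer, in [0, 1]


def composite_pixel(layers: List[Layer], immersion: float) -> float:
    """Blend layers back to front; lowering immersion diminishes virtual layers."""
    immersion = max(0.0, min(1.0, immersion))  # clamp the dial to [0, 1]
    color = 0.0
    for layer in sorted(layers, key=lambda l: l.depth, reverse=True):
        alpha = layer.opacity * immersion if layer.is_virtual else layer.opacity
        # "Over" operator: the nearer layer covers what is behind it by alpha.
        color = layer.value * alpha + color * (1.0 - alpha)
    return color
```

With a real background behind a virtual wall, `immersion=1.0` shows only the wall, `immersion=0.0` shows only the real background, and values in between reveal the real world partly. A real layer placed in front of the wall (eg, a person entering the room) stays visible at every dial setting, which corresponds to the example of a real-world person showing up between virtual objects.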
Real and virtual input from different senses could also be mixed; for example, there is a (virtual) hole in the VR room and the (real) rain falls inside this virtual room, generating simulated sounds as it hits virtual objects in the VR room. The same also works for outputs and their combinations; for example, the real voice of someone wearing the headset is mixed with virtual sounds, or real smells are mixed into VR by generating the corresponding virtual objects there. Finally, one could question whether the RV continuum really ends on one side with physical reality [ ], because the brain creates reality (comparable to a head-mounted display creating VR) but does not detect it. This could be discussed and explored by the community in the future, and in this regard, we want to propose a diagram of XR that loops back (as opposed to a continuum on a straight line with 2 ends) as an inspiration ( ). In summary, XR is a niche yet growing topic, because more and more consumer devices with better hardware have become available during the last few years. In parallel, artificial intelligence (AI)–based algorithms have made one breakthrough after another, which can, for example, be explored to develop novel applications for an intelligent health care metaverse [ ].
Acknowledgments
This work was supported by the TU Graz Open Access Publishing Fund. BP was funded by the Medical Faculty of the RWTH Aachen University in Germany as part of the Clinician Scientist Program. CG was funded by the Advanced Research Opportunities Program (AROP) from the RWTH Aachen University. Furthermore, we acknowledge the Center for Virtual and Extended Reality in Medicine (Zentrum für virtuelle und erweiterte Realität in der Medizin; ZvRM [ ]) of the University Hospital in Essen, Germany. We used the generative AI tool ChatGPT by OpenAI to identify the main categories of the reality-virtuality continuum and the status of diminished virtuality, which were further reviewed and revised by the study group. The original ChatGPT transcripts are made available as supplementary PNG files. Parts of the figures were created using the generative AI tool DALL·E 3 by OpenAI [ ].
Conflicts of Interest
BP is an associate editor of the Journal of Medical Internet Research. The other authors have no conflicts of interest to declare.
Asking ChatGPT about “the reality-virtuality continuum.”
PNG File, 303 KB
Asking ChatGPT about “diminished virtuality.”
PNG File, 212 KB
References
- Milgram P, Kishino F. A taxonomy of mixed reality visual displays. IEICE Transactions on Information and Systems. Dec 1994;E77-D(12):1321-1329.
- Milgram P, Takemura H, Utsumi A, Kishino F. Augmented reality: a class of displays on the reality-virtuality continuum. In: SPIE Proceedings, Volume 2351, Telemanipulator and Telepresence Technologies. 1995. Presented at: Photonics for Industrial Applications 1994; October 31 to November 4, 1994;282-292; Boston, MA.
- ChatGPT. OpenAI. URL: https://chat.openai.com/ [accessed 2024-01-23]
- Kim JK, Chua M, Rickard M, Lorenzo A. ChatGPT and large language model (LLM) chatbots: the current state of acceptability and a proposal for guidelines on utilization in academic medicine. J Pediatr Urol. Oct 2023;19(5):598-604.
- Lawler-Sagarin KA, Sagarin BJ, Pederson A. Enhanced community through augmented reality: social benefits of Pokémon Go. Psychol Rep. Aug 22, 2023:332941231197155.
- Gsaxner C, Li J, Pepe A, Jin Y, Kleesiek J, Schmalstieg D, et al. The HoloLens in medicine: a systematic review and taxonomy. Med Image Anal. Apr 2023;85:102757.
- Gruber LJ, Egger J, Bönsch A, Kraeima J, Ulbrich M, van den Bosch V, et al. Accuracy and precision of mandible segmentation and its clinical implications: virtual reality, desktop screen and artificial intelligence. Expert Syst Appl. Apr 2024;239:122275.
- Mori S, Ikeda S, Saito H. A survey of diminished reality: techniques for visually concealing, eliminating, and seeing through real objects. IPSJ Transactions on Computer Vision and Applications. Jun 28, 2017;9:17.
- Mann S. Mediated reality with implementations for everyday life. Presence Connect. Aug 6, 2002.
- Cheng YF, Yin H, Yan Y, Gugenheimer J, Lindlbauer D. Towards understanding diminished reality. 2022. Presented at: CHI '22: CHI Conference on Human Factors in Computing Systems; April 29 to May 5, 2022;1-16; New Orleans, LA.
- Gsaxner C, Mori S, Schmalstieg D, Egger J, Paar G, Bailer W, et al. DeepDR: deep structure-aware RGB-D inpainting for diminished reality. arXiv. Preprint posted online on December 1, 2023.
- Gsaxner C, Li J, Pepe A, Schmalstieg D, Egger J. Inside-out instrument tracking for surgical navigation in augmented reality. 2021. Presented at: VRST '21: 27th ACM Symposium on Virtual Reality Software and Technology; December 8-10, 2021;1-11; Osaka, Japan.
- Schmidt S. Blended spaces: perception and interaction in projection-based spatial augmented reality environments [dissertation]. University of Hamburg. 2020. URL: https://ediss.sub.uni-hamburg.de/bitstream/ediss/8644/1/dissertation.pdf [accessed 2024-01-22]
- Liang YW, Huang YH. Exploration of user experience in mixed reality for product virtual interaction and display. 2023. Presented at: 2023 IEEE 6th International Conference on Knowledge Innovation and Invention (ICKII); August 11-13, 2023;404-409; Sapporo, Japan.
- DALL·E 3. OpenAI. URL: https://openai.com/dall-e-3 [accessed 2024-01-25]
- Mann S, Furness T, Yuan Y, Iorio J, Wang Z. All reality: virtual, augmented, mixed (X), mediated (X,Y), and multimediated reality. arXiv. Preprint posted online on April 20, 2018.
- Egger J, Gsaxner C, Chen X, Bian J, Kleesiek J, Puladi B. Apple Vision Pro for healthcare: "the ultimate display"? - entering the wonderland of precision medicine. arXiv. Preprint posted online on August 8, 2023.
- Ienaga N, Bork F, Meerits S, Mori S, Fallavollita P, Navab N, et al. First deployment of diminished reality for anatomy education. 2016. Presented at: 2016 IEEE International Symposium on Mixed and Augmented Reality (ISMAR-Adjunct); September 19-23, 2016;294-296; Merida, Mexico.
- Iriqat S, Vatansever F. Comparison of reality types. Uludağ University Journal of The Faculty of Engineering. Dec 31, 2020;25(3):1155-1168.
- Hutzler F. Reverse inference is not a fallacy per se: cognitive processes can be inferred from functional imaging data. Neuroimage. Jan 1, 2014;84:1061-1069.
- Wang G, Badal A, Jia X, Maltz JS, Mueller K, Myers KJ, et al. Development of metaverse for intelligent healthcare. Nat Mach Intell. Nov 2022;4(11):922-929.
- Zentrum für virtuelle und erweiterte Realität in der Medizin. Universitätsklinikum Essen. URL: https://zvrm.ume.de/ [accessed 2023-01-23]
Abbreviations
AI: artificial intelligence
AR: augmented reality
AV: augmented virtuality
DR: diminished reality
DV: diminished virtuality
MR: mixed reality
POV: point of view
RV: reality-virtuality
VR: virtual reality
XR: extended reality
Edited by T Leung; submitted 19.09.23; peer-reviewed by M Hirota, K Sharrad; comments to author 24.10.23; revised version received 14.11.23; accepted 26.12.23; published 31.01.24.
Copyright©Jan Egger, Christina Gsaxner, Jens Kleesiek, Behrus Puladi. Originally published in JMIR XR and Spatial Computing (https://xr.jmir.org), 31.01.2024.
This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in JMIR XR and Spatial Computing, is properly cited. The complete bibliographic information, a link to the original publication on https://xr.jmir.org/, as well as this copyright and license information must be included.