ARetail allows users to generate a 3D model of a chair from a single 2D image using machine learning and see the generated chair in Augmented Reality.
ARetail is developed by students from the NUS High School class of 2019: Ang Kang Rong Roy, Kyle Zheng Ching Chan and Andrew Yapp Wei Rong.
- Take a photo of a chair that you want to generate a 3D model of
- Upload the image to our cloud server
- Open the ARetail mobile app and choose your chair
- Place it in Augmented Reality
This project consists of two parts: a server running a Python script that takes an image of a chair as input and generates a 3D model of that chair as output, and a mobile app that allows users to upload images of chairs and plot 3D models of chairs in Augmented Reality.
After an image is uploaded by the mobile app to our Firebase Storage, the server downloads it as the input. The background of the image is first removed using the OpenCV library. Empty space is then added along the shorter side to pad the image into a square.
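A minimal sketch of this pre-processing step, assuming a GrabCut-based background removal with white padding (the exact method and parameters in our script may differ):

```python
import cv2
import numpy as np

def preprocess(path):
    img = cv2.imread(path)

    # Remove the background with GrabCut, initialised from a rectangle
    # slightly inside the image borders (illustrative initialisation).
    mask = np.zeros(img.shape[:2], np.uint8)
    bgd = np.zeros((1, 65), np.float64)
    fgd = np.zeros((1, 65), np.float64)
    rect = (10, 10, img.shape[1] - 20, img.shape[0] - 20)
    cv2.grabCut(img, mask, rect, bgd, fgd, 5, cv2.GC_INIT_WITH_RECT)
    fg = np.where((mask == cv2.GC_FGD) | (mask == cv2.GC_PR_FGD), 1, 0)
    img = img * fg[:, :, None].astype(np.uint8) \
        + 255 * (1 - fg[:, :, None]).astype(np.uint8)  # white background

    # Pad the shorter side with empty (white) space so the image is square.
    h, w = img.shape[:2]
    side = max(h, w)
    top, left = (side - h) // 2, (side - w) // 2
    return cv2.copyMakeBorder(img, top, side - h - top, left, side - w - left,
                              cv2.BORDER_CONSTANT, value=(255, 255, 255))
```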
**Challenges faced**
We needed to improve the reconstruction to make better-looking 3D models. Pre-processing immensely improved accuracy by removing irrelevant information that would "distract" the neural network.
With the 2D image of a chair as input, we perform 3D reconstruction. We first approximate a triangular mesh that resembles the image using a Perspective Transformer Network. This Perspective Transformer Network uses a machine learning model that has been pre-trained on more than a thousand images of chairs from the ShapeNetCore dataset [1]. We then render silhouettes of this mesh using a neural renderer [2]. For evaluation, the rendered silhouettes are compared against the input image and the smoothness of the mesh is measured; this loss is backpropagated to improve the likeness of the mesh to the input image. This process is iterated to eventually produce a 3D mesh that resembles the input image of a chair from every perspective.
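A simplified sketch of this refinement loop, assuming the PyTorch port of the neural renderer [2]; the viewpoint, loss weight and edge-length smoothness term here are illustrative stand-ins for our actual settings:

```python
import torch
import neural_renderer as nr  # PyTorch port of the neural renderer [2]

def edge_length_loss(vertices, faces):
    # Simple smoothness proxy: penalise long edges so the surface stays regular.
    v0, v1, v2 = vertices[faces[:, 0]], vertices[faces[:, 1]], vertices[faces[:, 2]]
    return ((v0 - v1) ** 2).sum() + ((v1 - v2) ** 2).sum() + ((v2 - v0) ** 2).sum()

def refine_mesh(vertices, faces, target_silhouette, steps=300):
    # vertices: (V, 3) float tensor from the initial mesh estimate,
    # faces: (F, 3) long tensor, target_silhouette: (H, W) mask of the input photo.
    vertices = vertices.clone().requires_grad_(True)
    renderer = nr.Renderer(camera_mode='look_at')
    renderer.eye = nr.get_points_from_angles(2.5, 30.0, 0.0)  # assumed input viewpoint
    optimizer = torch.optim.Adam([vertices], lr=0.01)

    for step in range(steps):
        # Render a silhouette of the current mesh.
        silhouette = renderer(vertices[None], faces[None], mode='silhouettes')

        # Compare against the input image's silhouette, keep the mesh smooth,
        # and backpropagate the combined loss into the vertex positions.
        loss = ((silhouette[0] - target_silhouette) ** 2).sum() \
             + 0.1 * edge_length_loss(vertices, faces)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

    return vertices.detach()
```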
From the pre-processed input image of the chair, we extract the colors of the chair. We then perform style transfer by "wrapping" the colors onto the texture of the mesh. We save this colored mesh as a 3D model in a .glb file and upload it to our Firebase Storage, where it is now accessible by users through the mobile app.
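A sketch of the server's Firebase Storage interaction using the firebase_admin SDK (the bucket name and file paths are placeholders):

```python
import firebase_admin
from firebase_admin import credentials, storage

# Initialise the SDK with a service-account key (path is a placeholder).
cred = credentials.Certificate('serviceAccountKey.json')
firebase_admin.initialize_app(cred, {'storageBucket': 'aretail.appspot.com'})
bucket = storage.bucket()

# Download the uploaded chair image as the input to the pipeline...
bucket.blob('images/chair.jpg').download_to_filename('chair.jpg')

# ...and, once reconstruction finishes, upload the finished .glb model.
bucket.blob('models/chair.glb').upload_from_filename('chair.glb')
```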
**Challenges faced**
After much trial and error and weeks of frustration, we found out that we had to save the model as a scene instead of an object (which was so counter-intuitive) to retain its colors. A .glb file contains all data about the mesh and textures, as opposed to a .gltf file, which is not self-contained, or a .obj file, which does not contain textures.
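For example, with a library like trimesh the fix looks like this (the library choice and file names are illustrative, not necessarily what our script uses):

```python
import trimesh

# Load the textured mesh produced by the reconstruction step.
mesh = trimesh.load('chair_textured.obj')

# Exporting the bare mesh drops the colors; wrapping it in a Scene
# first keeps the textures inside the self-contained .glb file.
scene = trimesh.Scene(mesh)
scene.export('chair.glb')
```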
The Import Activity makes use of intents to allow users to import images either by taking photos in-app or by sharing images to the ARetail app. The user can input a name, a price (optional) and dimensions (optional) for the chair. The image is then uploaded to our Firebase Storage for the server to generate the 3D model.
The AR Activity makes use of the ARCore library, which allows the plotting of 3D models in Augmented Reality (AR). The live camera feed is used for AR, and flat surfaces such as floors and tables are detected. The user first chooses a chair from a dropdown list, which is populated by the available 3D models in our Firebase Storage. The user can then place multiple 3D models of chairs on these flat surfaces in AR, with realistic dimensions and textures (reflections and shadows), to see if the chair would fit into their surroundings.
[1] Chang, Angel X., et al. "ShapeNet: An Information-Rich 3D Model Repository." arXiv preprint arXiv:1512.03012 (2015).
[2] Kato, Hiroharu, Yoshitaka Ushiku, and Tatsuya Harada. "Neural 3D Mesh Renderer." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (2018).