Making Any Pen SMART!

Location: St Leonards, Sydney NSW

Duration: 5 months

Proposed start date: ASAP

Keywords: Image processing algorithms, TensorFlow framework, knowledge of any of the languages C, C++, Python or Java, and experience with neural networks

Project Background

Australia is home to some of the most geographically isolated and remote communities in the world. Inkerz’ vision is to enable effective education for all pupils regardless of their location. In the past, students in rural areas had to attend boarding school or receive lessons by radio, mail or email, which made the process difficult and unengaging. Students were either separated from their families or had no interaction with their teacher and peers.

Inkerz’ technology will focus on bridging the education gap faced by remote areas. In turn, Inkerz hope to benefit Australian communities by providing proper access to education for all students regardless of their remoteness. Inkerz hope to collaborate and partner with distance education providers to reach these isolated communities. The initiative to connect remote areas with effective education is also backed by Telstra. Inkerz are an alumnus of Telstra’s muru-D accelerator program for start-ups, making Telstra a shareholder with a vested interest in the company. Inkerz are cooperating with Telstra’s Country Wide division to introduce better distance education technology to remote areas of Australia.

A smart pen and whiteboard combination is one of the most important tools in education and other industries. In an online classroom, for example, a whiteboard is indispensable: it provides a collaborative, shared area that everyone in the class can read from and write on. Content drawn or written by any participant should be visible to all other parties. Inkerz believe that the most important part of providing a virtual whiteboard is the means of capturing user input and placing it on the whiteboard area. Possible input tools include a keyboard, a mouse and a digital pen, each of which has its advantages and disadvantages:

  • A keyboard is an easy way of capturing user input, and most users are accustomed to it. However, it limits how users can express their ideas, as they cannot draw or enter complex notation such as math formulas on the whiteboard
  • A mouse can also be used to draw and write on the whiteboard, but it is very hard to write or draw exactly as intended. A digital/smart pen is a good choice of tool in this context
  • The disadvantage of a smart/digital pen is that most users do not have one, and providing one to each user could be burdensome and complex for both the company and the user
  • Using a camera to detect paper and its contents is Inkerz’ preferred and proposed input method. The user writes or draws on paper and, using image processing, the writing and drawings are extracted and placed on the virtual whiteboard

Research to be Conducted

Inkerz is seeking a PhD candidate with knowledge of deep learning and image processing to develop methods that, in brief, use a camera as a mouse-like input device. The aim is to extract the user’s writing and drawings from paper detected in webcam images and to reconstruct the paper content on a virtual whiteboard from the extracted data.

Inkerz believe this method to be a breakthrough for subjects like mathematics, economics, physics, creative arts, geography and tech design where handwriting remains vital, and central to all subjects and courses that require students to write formulas and equations, draw diagrams, compose music, sketch design ideas or simply brainstorm.

To detect handwritten content on the paper, Inkerz have already developed methods to detect the image area containing the paper. From the detected paper area, a perspective transform is calculated and the webcam image is warped into a top-down view showing only the paper. In other words, the process acts like a paper scanner. The perspective-transformed images are referred to as scanned images.
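The scanning step described above can be illustrated with a minimal sketch. It is not Inkerz’ implementation: it assumes the four paper corners are already known (the corner coordinates and the 420×594 target size below are made up for illustration) and solves for the 3×3 perspective matrix in plain Python. In practice, OpenCV's `cv2.getPerspectiveTransform` and `cv2.warpPerspective` would normally be used for this.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def solve_linear(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a*x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate corner configuration")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                for c in range(col, n + 1):
                    m[r][c] -= f * m[col][c]
    return [m[i][n] / m[i][i] for i in range(n)]


def perspective_matrix(src: Sequence[Point], dst: Sequence[Point]) -> List[List[float]]:
    """Return the 3x3 matrix H (with H[2][2] = 1) mapping each src corner to dst."""
    a, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        # u = (h0*x + h1*y + h2) / (h6*x + h7*y + 1), and likewise for v
        a.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        b.append(u)
        a.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.append(v)
    h = solve_linear(a, b)
    return [h[0:3], h[3:6], [h[6], h[7], 1.0]]


def warp_point(h: List[List[float]], p: Point) -> Point:
    """Map one webcam pixel position into scanned-image coordinates."""
    x, y = p
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    return ((h[0][0] * x + h[0][1] * y + h[0][2]) / w,
            (h[1][0] * x + h[1][1] * y + h[1][2]) / w)


# Hypothetical detected paper corners in a 1280x720 frame (clockwise from top-left)
corners = [(400.0, 200.0), (900.0, 220.0), (950.0, 600.0), (350.0, 580.0)]
# Hypothetical top-view target rectangle for the scanned image
target = [(0.0, 0.0), (420.0, 0.0), (420.0, 594.0), (0.0, 594.0)]

H = perspective_matrix(corners, target)
```

Calling `warp_point(H, corner)` on each detected corner should return the matching corner of the target rectangle, up to floating-point error. A full scanner would apply the inverse mapping to every pixel of the target image.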

As the webcam image resolution is usually 1280×720 and the paper usually occupies less than 30 percent of the captured image area, the scanned image contains fewer pixels than the captured frame and its quality is always lower. Inkerz are seeking methods to provide high-resolution paper content on the virtual whiteboard. Possible approaches are listed below in order of preference:
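To make the resolution constraint concrete, the arithmetic can be worked through as a rough sketch. The 1280×720 frame size and the 30 percent figure come from the description above; the A4 paper size is an assumption made only for this illustration, and perspective foreshortening is ignored, so the result is an upper bound.

```python
import math

FRAME_W, FRAME_H = 1280, 720   # typical webcam resolution (from the text)
PAPER_FRACTION = 0.30          # paper covers at most ~30% of the frame (from the text)
A4_ASPECT = math.sqrt(2)       # assumption: A4 paper, long side / short side
A4_SHORT_SIDE_IN = 210 / 25.4  # assumption: A4 short side (210 mm) in inches

frame_pixels = FRAME_W * FRAME_H               # total pixels in one frame
paper_pixels = frame_pixels * PAPER_FRACTION   # upper bound on pixels covering the paper

# Side lengths of an A4-shaped rectangle containing that many pixels
short_side = math.sqrt(paper_pixels / A4_ASPECT)
long_side = short_side * A4_ASPECT

# Effective scan resolution in pixels per inch
ppi = short_side / A4_SHORT_SIDE_IN

print(f"frame pixels: {frame_pixels}")
print(f"paper pixels (upper bound): {paper_pixels:.0f}")
print(f"scanned size: about {short_side:.0f} x {long_side:.0f} pixels")
print(f"effective resolution: about {ppi:.0f} pixels per inch")
```

Under these assumptions the paper is covered by fewer than roughly 280,000 of the frame's 921,600 pixels, which corresponds to a scan of only a few dozen pixels per inch.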

Approach 1 (preferred):

Detect the written content on the scanned paper and output a series of points (positions) whose interconnections reconstruct the same handwritten content on the whiteboard. In graph-theory terms, the algorithm output would be a disconnected graph.
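One possible way to represent the output of Approach 1 is sketched below. The `StrokeGraph` class and its method names are hypothetical and not part of any Inkerz API. The sketch assumes that each pen stroke becomes a chain of connected points, so separate strokes form separate connected components and the whole page is a disconnected graph.

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Set, Tuple

Point = Tuple[float, float]


class StrokeGraph:
    """Hypothetical container: points are nodes, pen movements are edges."""

    def __init__(self) -> None:
        self.points: List[Point] = []
        self.edges: Dict[int, Set[int]] = defaultdict(set)

    def add_stroke(self, polyline: Sequence[Point]) -> None:
        """Append one stroke as a chain of consecutively connected points."""
        start = len(self.points)
        self.points.extend(polyline)
        for i in range(start, len(self.points) - 1):
            self.edges[i].add(i + 1)
            self.edges[i + 1].add(i)

    def components(self) -> List[List[int]]:
        """Return the connected components as sorted lists of node indices."""
        seen: Set[int] = set()
        result: List[List[int]] = []
        for node in range(len(self.points)):
            if node in seen:
                continue
            seen.add(node)
            stack, component = [node], []
            while stack:  # depth-first traversal of one component
                current = stack.pop()
                component.append(current)
                for neighbour in self.edges.get(current, ()):
                    if neighbour not in seen:
                        seen.add(neighbour)
                        stack.append(neighbour)
            result.append(sorted(component))
        return result


# A handwritten letter "i": a three-point stem plus a separate dot
graph = StrokeGraph()
graph.add_stroke([(10.0, 20.0), (10.0, 30.0), (10.0, 40.0)])
graph.add_stroke([(10.0, 12.0)])
print(graph.components())
```

A whiteboard client could redraw the page by walking each component and drawing line segments along its edges, which is independent of the resolution of the scanned image.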

Approach 2:

Implement methods to improve the quality of the scanned paper image using image processing. Methods such as deep image prior [1] are of interest.

[1] https://dmitryulyanov.github.io/deep_image_prior

Skills Required

Inkerz are looking for a PhD student with the following skills:

  • Hands-on experience with image processing algorithms, preferably using OpenCV
  • Experience with deep-learning frameworks, preferably TensorFlow, and with convolutional neural networks
  • Good knowledge of any of the languages C, C++, Python and Java

Expected Outcomes

To implement a fast, real-time API and a corresponding stand-alone example application that takes noisy images as input and outputs a high-quality reconstructed image.

  • For Approach 1, the algorithm and API should output a disconnected graph. As a demonstration and proof of concept, a standalone application built on the API should use the disconnected graph to output the reconstructed image either to the screen or to a file
  • For Approach 2, the algorithm should take a noisy image as input and output the corresponding high-quality image. A standalone application incorporating the algorithm should be provided that either shows the output image on the screen or writes it to a file
  • The code must be delivered in Python, C/C++ or Java
  • The algorithm(s) should be sufficiently optimized in memory and CPU usage that they can later be used on mobile devices. The minimum output frame rate is 8 frames per second on commonly used mobile devices
  • If a deep-learning approach is used, the TensorFlow framework is preferred

Supporting documentation covering the theory and algorithms behind the developed code is also required.
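The frame-rate requirement translates into a per-frame time budget: at 8 frames per second, each frame must be processed in at most 1/8 s, or 125 ms. The following is a minimal sketch of how a candidate algorithm could be checked against that budget. The `measure_fps` helper and the placeholder `process_frame` callable are hypothetical, and a desktop measurement is only indicative of mobile performance.

```python
import time
from typing import Callable, Iterable

MIN_FPS = 8.0                   # minimum frame rate (from the requirements)
FRAME_BUDGET_S = 1.0 / MIN_FPS  # time available per frame: 0.125 s


def measure_fps(process_frame: Callable[[object], object],
                frames: Iterable[object]) -> float:
    """Run process_frame over all frames and return the achieved frames per second."""
    count = 0
    start = time.perf_counter()
    for frame in frames:
        process_frame(frame)
        count += 1
    elapsed = time.perf_counter() - start
    return count / elapsed if elapsed > 0 else float("inf")


def placeholder_process_frame(frame: object) -> None:
    """Stand-in for the real reconstruction algorithm (does nothing)."""
    return None


fps = measure_fps(placeholder_process_frame, range(100))
print(f"budget per frame: {FRAME_BUDGET_S * 1000:.0f} ms")
print(f"meets requirement: {fps >= MIN_FPS}")
```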

Additional Details

The intern will receive $3,000 per month of the internship, usually in the form of stipend payments.

It is expected that the intern will primarily undertake this research project during regular business hours, spending at least 80% of their time on-site with the industry partner. The intern will be expected to maintain contact with their academic mentor throughout the internship through either face-to-face or phone meetings as appropriate. The intern and their academic mentor will have the opportunity to negotiate the project’s scope, milestones and timeline during the project planning stage.

To participate in the APR.Intern program, all applicants must satisfy the following criteria:

  • Be a PhD student currently enrolled at an Australian university
  • PhD candidature must be confirmed
  • Applicants must have the written approval of their Principal Supervisor to undertake the internship. This approval must be submitted at the time of application.
  • Internships are also subject to any requirements stipulated by the student’s and the academic mentor’s university

Applications Close

28 May 2018

Reference

INT – 0427