GSoC :: Coding Period – Phase One (May 14th to June 12th): Initial implementation of typewriter annotation tool in Okular

Hi everyone,

Phase one of the coding period is now complete, and I have finished the initial implementation of the typewriter annotation tool in Okular along with the integration tests for it. I have created a revision on Phabricator, which is currently under review; some review comments from my mentor are still to come.

As per the agreed timeline, I have implemented a fully functional typewriter tool that creates an annotation with a transparent background in all supported document formats. For now, the text input UI is a popup QInputDialog window, in line with the existing inline note.

This is how it works:

Thanks to my mentor, Tobias Deiminger, and the other Okular developers who helped me whenever I was stuck.

The typewriter tool icon is inspired by Adobe Reader’s. We are still missing a number of vital features in the typewriter annotation and in the other annotations, which we plan to complete in the next phase, and I need to make some fixes before proceeding to the other goals of this project.

The first 15 days of phase one were quite busy for me because of my college exams, so I could only devote 15 hours a week; the last 10 days were spent figuring out how to write the tests and writing a few of them. Our plan for the next phase is the following:

  • Font color implementation in Poppler
  • Font color chooser in typewriter annotation’s settings dialog
  • Respect font family in Poppler
  • Writing integration tests

And yes, after this phase, we will try a WYSIWYG approach for the typewriter annotation.

You can track my commits at https://cgit.kde.org/okular.git/log/?h=gsoc2018_typewriter

Feedback and suggestions are always welcome :)

Wait for the next post…


FreeText typewriter annotation WYSIWYG implementation ideas

As part of my GSoC project, I’m working with my mentor Tobias Deiminger on implementing the FreeText typewriter annotation in Okular with a click-to-type WYSIWYG editing feature for writing directly on the PDF page. We have come up with the following high-level implementation ideas:

Idea 1: Dedicated annotation widgets

This is inspired by the existing FormWidget implementation. FormWidgets have a unified interface towards PageView. They scroll and scale with the viewport and let the user edit content in place. They come in different flavors, e.g. radio buttons and text input. The in-place text input in particular sounds quite like what we want for the typewriter. So let’s create a similar TypewriterWidget; making all annotations separate widgets could be a path towards a better design, and the typewriter could be the first to actually do it.

Idea 1.1: Custom typewriter widget rendered by the generator

With every keystroke, TypewriterWidget updates the underlying annotation object. TypewriterWidget::paintEvent then requests an up-to-date QImage containing just this one annotation from Poppler and paints it. Only the “Splash” rendering engine is at work, so the annotation is always rendered by it.
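To make the idea a bit more concrete, here is a minimal sketch of what such a widget could look like. TypewriterWidget and requestAnnotationImage() are hypothetical names used only for illustration and don’t exist in Okular; the sketch just assumes a QWidget that holds an Okular::Annotation pointer and repaints itself from a Poppler-rendered image on every keystroke.

```cpp
// Minimal sketch only -- TypewriterWidget and requestAnnotationImage() are
// hypothetical names, not existing Okular API.
#include <QKeyEvent>
#include <QPainter>
#include <QWidget>

#include <core/annotations.h> // Okular::Annotation

class TypewriterWidget : public QWidget
{
public:
    explicit TypewriterWidget(Okular::Annotation *annotation, QWidget *parent = nullptr)
        : QWidget(parent)
        , m_annotation(annotation)
    {
        setFocusPolicy(Qt::StrongFocus); // we want the keystrokes
    }

protected:
    void keyPressEvent(QKeyEvent *event) override
    {
        // Every keystroke updates the underlying annotation object
        // (simplified: no backspace or cursor handling here) ...
        m_annotation->setContents(m_annotation->contents() + event->text());
        // ... and schedules a repaint, which re-renders via Poppler.
        update();
    }

    void paintEvent(QPaintEvent *) override
    {
        // Ask for an up-to-date image of just this one annotation, rendered
        // by Poppler's Splash backend, and paint it.
        const QImage image = requestAnnotationImage();
        QPainter painter(this);
        painter.drawImage(rect(), image);
    }

private:
    // Stands in for the missing single-annotation rendering call
    // (something like Annotation::renderToImage, see below).
    QImage requestAnnotationImage() const;

    Okular::Annotation *m_annotation = nullptr;
};
```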

PDF supports saving annotations into the document and defines their native rendering; in Okular all of this is done by means of Poppler. Currently, in the print and PageView::paintEvent cases, the generator renders the annotations.

This idea probably requires something like Annotation::renderToImage to save CPU time, because Page::renderToImage notes that “x, y, w, h are not well-tested” and re-rendering the whole page is rather expensive. This is the best WYSIWYG approach for writing directly on the PDF page, but implementing it might take a long time.
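For illustration, a hypothetical signature for such a call, sketched by analogy with the existing Poppler::Page::renderToImage, might look like the following. Nothing like it exists in Poppler today; the helper name and parameters are assumptions.

```cpp
#include <QImage>

#include <poppler-annotation.h>
#include <poppler-qt5.h>

// Purely hypothetical helper, modeled on the existing
// Poppler::Page::renderToImage(xres, yres, x, y, w, h). It would rasterize
// only the given annotation's bounding box with Splash and return it as a
// QImage with a transparent background, so that a repaint per keystroke
// stays cheap.
QImage renderAnnotationToImage(Poppler::Page *page,
                               Poppler::Annotation *annotation,
                               double xres = 72.0,
                               double yres = 72.0);
```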

Idea 1.2: Use KTextEdit during user input

Here we relinquish WYSIWYG during the user input phase. It’s more like “what you see during user input is similar to what you get when the page is printed or painted”. There’s a flag meaning “do not render this annotation by the generator”. Let’s set it while user input is in progress, and switch back to generator rendering once input is finished.
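Before looking at the experimental output below, here is a rough sketch of what that flow could look like. The helper functions are illustrative only, and the flag name Okular::Annotation::ExternallyDrawn is an assumption for the “do not render this annotation by the generator” flag.

```cpp
#include <KTextEdit>

#include <core/annotations.h> // Okular::Annotation

// Illustrative helpers, not existing Okular functions.
void startTypewriterInput(Okular::Annotation *annotation, KTextEdit *editor)
{
    // Hide the annotation from the generator while it is being edited
    // (flag name is an assumption) ...
    annotation->setFlags(annotation->flags() | Okular::Annotation::ExternallyDrawn);

    // ... and edit its text in a plain KTextEdit placed over the page.
    editor->setPlainText(annotation->contents());
    editor->show();
    editor->setFocus();
}

void finishTypewriterInput(Okular::Annotation *annotation, KTextEdit *editor)
{
    // Commit the edited text back to the annotation ...
    annotation->setContents(editor->toPlainText());
    editor->hide();

    // ... and hand rendering back to the generator (Poppler/Splash).
    annotation->setFlags(annotation->flags() & ~Okular::Annotation::ExternallyDrawn);
}
```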

Here is the output from our experimental code:

You can see that the downside of this idea is the visible difference before and after editing. We should at least fine-tune KTextEdit to make it less noticeable; that way we could get rid of the position offset and use a more similar font. But the FreeText rendering ignores the font, which causes the most noticeable difference. See https://bugs.freedesktop.org/show_bug.cgi?id=81748#c1.

There’s another, more fundamental issue in the above example: different rendering engines are at work. In KTextEdit the font is rendered by the Qt paint system; after switching to Poppler, the annotation is rendered by “Splash”. Both may or may not use libfreetype (see https://doc.qt.io/qt-5.10/qtgui-attribution-freetype.html and poppler/splash/SplashFTFontEngine.cc), so the algorithms for hinting and anti-aliasing may or may not differ. And Poppler uses styling rules from the PDF, whereas KTextEdit carries its own set of styling rules.

Idea 2: Don’t make the typewriter UI a QWidget

Instead, we handle input events, state, and painting directly in PageView. We keep the idea of requesting a QImage with the single annotation from Poppler and painting it. This makes PageView a bit more crowded again, but the approach is in good tradition.
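A very rough sketch of how this could look inside PageView follows. The member and helper names (m_typewriterAnnotation, annotationViewportRect, requestAnnotationImage) are hypothetical; only keyPressEvent and paintEvent are methods PageView actually overrides.

```cpp
// Hypothetical additions to PageView: the view itself keeps the editing
// state and paints the annotation image, without a dedicated widget.
void PageView::keyPressEvent(QKeyEvent *e)
{
    if (m_typewriterAnnotation) { // typewriter input in progress
        // Append the typed text to the annotation ...
        m_typewriterAnnotation->setContents(m_typewriterAnnotation->contents() + e->text());
        // ... and repaint only the area it covers in viewport coordinates.
        viewport()->update(annotationViewportRect(m_typewriterAnnotation));
        return;
    }
    QAbstractScrollArea::keyPressEvent(e); // everything else as before
}

void PageView::paintEvent(QPaintEvent *event)
{
    // ... existing page and annotation painting ...

    if (m_typewriterAnnotation) {
        // Paint the single-annotation QImage requested from Poppler on top,
        // same as in Idea 1.1 but without a dedicated widget.
        QPainter p(viewport());
        p.drawImage(annotationViewportRect(m_typewriterAnnotation),
                    requestAnnotationImage(m_typewriterAnnotation));
    }
}
```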

Problems

There’s the class MouseAnnotation, which handles selecting, moving and resizing annotations. It was not designed to work with widgets; it just draws into the PageView. We either have to redesign it or duplicate the relevant events, sending one copy to MouseAnnotation and another to TypewriterWidget. It’s hard to say how much of a mess that would create.


The beginning of GSoC 2018 with KDE

The results for GSoC 2018 were set to be announced on April 23, 2018 at 21:30 (GMT +05:30). When the time came to browse the list and check for my own name, time almost stopped for me. A wave of excitement, fear and tension ran down my spine.

Well, after the next five minutes I was finally able to confirm that my project proposal had been selected. It was an awesome feeling to finally see my name listed.

My Proposal

With an interest in desktop application projects and a wish to work with C++, my all-time favorite programming language, my main and only choice was to apply with KDE. After days of reading about their project opportunities, I settled on Okular.

To be honest, I started thinking about GSoC in late January, and KDE was the only organization that interested me, as I had been using its products for months. I was having issues with Okular, the KDE document viewer, and so I began sending a few patches in February.

Finally, after investing a significant amount of time, I was ready with my proposal for GSoC 2018. The title of my proposal is “Implementing the FreeText annotation with FreeTextTypeWriter behavior”, and I am also sharing the selected proposal here, in the hope that it will be helpful to others.

Selected Project Details

In addition to the above details: Okular already implements the FreeText annotation as “Inline Notes” and shows existing typewriter annotations, allowing them to be moved, deleted and modified, but a new one cannot be created and edited with the FreeTextTypeWriter behavior (WYSIWYG live editing) directly on the PDF page. Its annotation toolbar should gain a new entry, “FreeText”, with options to configure font size, color, and type, along with an auto-resizing rectangle and click-to-type FreeTextTypeWriter behavior for writing directly on the PDF page.

I am looking forward to doing the best work I can for KDE this summer under my mentor Tobias Deiminger. Thanks to my mentor and the organization for considering me worthy of their selection list. :)

From Friends :)

Congratulations bro for getting into summer of code…

Posted by Dinesh Mali Dkm on Monday, April 23, 2018


Beginning as Linux kernel contributor

Submitting my first patch to the Linux kernel was a breathtaking experience. It all started in the second week of February. It took me about an hour to clone the latest Linux kernel source tree on an 8 Mbps internet connection and then almost a whole night to compile and build the 4.15 Arch Linux kernel. I followed the kernelnewbies guide thoroughly and also read the first three chapters of Linux Device Drivers, Third Edition. The book introduced me to device drivers, their specific types, and inserting/removing them as modules at runtime. Its sample code helped me write a ‘hello world’ driver and experiment with both the insmod and rmmod commands, but the code in the subsequent chapters is a bit outdated.

Many people advised reading books on operating systems and Linux kernel development first and only then trying to contribute, whereas a few advised following the kernelnewbies guide and using bug-finding tools to fix errors. I followed the latter, because looking at the code around the errors and at similar code in other files, and spending more time experimenting with the code, are the best ways to learn and understand the kernel and to find one’s area of interest.

My first cleanup was removing a possibly unnecessary ‘out of memory’ warning from the usbpipe.c file of the vt6656 driver. I submitted my first patch on February 10. After adding the changelog, I got a mail from Greg Kroah-Hartman on February 12 stating that my patch had been added to the staging-next branch and would be merged in the next major kernel release! What a feeling!

I’m looking forward to tackling the TODO list after learning, understanding and experimenting with the code and the book for a while…


My freshman summer internship at Swaayatt Robots

Hello world! This is my first foray into this newly created blog.

This blog post is about my experiences and learnings as a freshman intern at Swaayatt Robots, India. Swaayatt Robots is currently developing on-road self-driving technology that works on unusual Indian roads and environments without LiDAR or RADAR.

Last year, I was looking for an internship to get familiar with industry culture and etiquette and to get some hands-on experience, and that was when I came in contact with Sanjeev Sharma, the founder, and asked for a position. After a short telephonic interview, I was called in for the internship. Back then, I had been coding for four years and was comfortable with the C99 and C++11/14 standards, both Python 2 and 3, Linux and the shell, and had built some hacks using OpenCV; he mentioned that this made him consider me suitable for the intern job.

He had told me beforehand that I would be coding the SLAM and lane-marker algorithms, as the startup was devising its own algorithms based on its research work. But as soon as I joined, I was told on day 0 to develop a self-driving taxi booking Android app so that they could showcase their concept of self-driving taxi services. The same day, he told me to take my time and build a demo app with a basic login activity that prints a login success message on the server console. I went back to my accommodation and spent days 0 and 1 learning Java and the basic concepts of Android. I had experience with PHP, SQL and the nginx server before, which is why I was able to present that simple app on day 2.

Later on, he gave me a few more tasks to improve the app: adding an emergency mode, plotting the stream of GPS coordinates from the server onto the map in Android, and communicating with the self-driving vehicle. In the meanwhile, I was assigned other tasks as well, including working with OpenGL and image processing to show the environment around the vehicle, assisting senior interns in coding the SLAM framework, and working with a senior to implement the Harris corner detection algorithm in another framework.

My senior interns were there for research internships, and they used to help me with the theoretical knowledge I was lacking in computer vision and machine learning. I came to know about hyped terms like DNN, CNN, RL and DRL, and I began to realize how AI is changing the world and how I could get started with ML. I began with Gilbert Strang’s Linear Algebra lectures and came to appreciate the subject. I also got a slight idea of what research is (though it is still a broad term for an undergrad like me), and for my own tasks I read academic papers for the first time in my life.

I was getting overstressed, so I limited myself to coding and scripting tasks only. Later on, I developed the web version of the same booking app, and for that I had to learn web development from scratch. In my final days, we were collecting voice datasets for an NLP task, but the work was time-consuming. I thought of speeding up the process and developed two scripts for auto-segmentation and collection, which saved us a lot of time. You can find these scripts on GitHub.


Some further reading about Swaayatt Robots:

  1. Swaayatt’s self-driving technology and research in IoT India Magazine
  2. Article about the startup on Entrepreneur.com
  3. “This maker team from Bhopal is custom-building an autonomous vehicle technology for India” – FactorDaily
  4. Swaayatt Robots is now in the NVIDIA India Startup Inception Program