In one of my previous posts, Installing queXF in Ubuntu, I wrote about queXF and its installation process. In this post I will write about how to use queXF.
According to the queXF documentation, forms created with queXML work with queXF. From the queXF documentation: “Please note that the author has not tested a form that was not created in queXML, therefore can make no guarantees that it will work (Although it should)”.
So I decided to use queXML for creating forms. The fastest way was to use the test_questionnaire.xml file that comes with queXML as the starting point for creating my forms.
To use queXML locally, I need Apache FOP 0.94 (0.95 also works, which is what I tested with, but version 1.0 didn’t work for me) and barcode4j 2.0.
Installation:
- We need to have Java installed.
- Download and extract Apache FOP, and add the directory to your PATH (optional),
e.g. I have extracted fop-0.95-bin.zip to /usr/local/fop-0.95/ and added this directory to my PATH variable.
- Download and extract barcode4j-2.1.0-bin.zip inside the FOP directory,
e.g. in my case I copied it to /usr/local/fop-0.95/barcode4j-2.1.0
- Download the Barcode4j extensions for Apache FOP, the barcode4j-fop-ext-complete-2.0.jar file, from
http://mirrors.ibiblio.org/pub/mirrors/maven2/net/sf/barcode4j/barcode4j-fop-ext-complete/2.0/barcode4j-fop-ext-complete-2.0.jar
and copy it to the /usr/local/fop-0.95/barcode4j-2.1.0/build folder.
- Edit the fop file (in my case it is /usr/local/fop-0.95/fop) and add the barcode4j classpath.
- Download and extract queXML-1.1.0
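The classpath edit to the fop file mentioned in the list above can be sketched roughly as follows. This is only a sketch under assumptions: it assumes the fop launcher is a shell script that collects its jars in a LOCALCLASSPATH variable before starting Java, so check your own copy of the script because the variable name may differ. The helper name add_jars_to_classpath is made up for illustration, and the directory is the one from my setup.

```shell
# Sketch: prepend every jar found in a directory to LOCALCLASSPATH.
# Assumption: the fop script hands LOCALCLASSPATH to java as its classpath.
# add_jars_to_classpath is a hypothetical helper name, not part of FOP.
add_jars_to_classpath() {
    for jar in "$1"/*.jar; do
        # An unmatched glob stays a literal string, so skip anything
        # that is not an existing file.
        [ -f "$jar" ] || continue
        # Prepend the jar; add a ':' only if the variable is non-empty.
        LOCALCLASSPATH="$jar${LOCALCLASSPATH:+:$LOCALCLASSPATH}"
    done
}

# Directory where barcode4j-fop-ext-complete-2.0.jar was copied.
add_jars_to_classpath /usr/local/fop-0.95/barcode4j-2.1.0/build
```

If you use something like this, it would go inside the fop script after the script has added FOP's own jars and before it launches Java.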
The installation part is over. Now I will edit the test_questionnaire.xml file of queXML to create a new XML file (say test_questionnaire1.xml), and from this XML file I will generate the PDF form (say newtest.pdf).
My test_questionnaire1.xml file looks like:
I will use fop to generate the pdf file
root@ubuntu3:~# fop -xml quexml-1.3.10/test_questionnaire1.xml -xsl quexml-1.3.10/to_form.xslt -pdf newtest.pdf -param questionnaireId 197 -param show_cover_page false
The fop command above generates our form as the newtest.pdf file, and I will use this form in queXF. questionnaireId is a number that I gave to my form so that the form can be uniquely identified by its barcode.
The form generated by the fop command looks like:
Now I will import this new form into queXF. Go to your queXF site and open the admin console (e.g. http://192.168.10.179/admin), then click the link Import a new form from a PDF file.
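If you generate forms often, the fop call can be wrapped in a small shell function that checks the questionnaire ID before running. This is a minimal sketch, not something that ships with queXML: the function name generate_form and the FOP override variable are my own hypothetical names, while the fop options are the same ones used in the command above.

```shell
# Sketch: wrap the fop call so the questionnaire ID is checked first.
# generate_form and the FOP variable are hypothetical names.
generate_form() {
    xml=$1    # queXML questionnaire file
    xslt=$2   # to_form.xslt from the queXML directory
    pdf=$3    # output PDF
    qid=$4    # numeric ID that ends up in the barcode

    # Reject an empty ID or one containing anything but digits.
    case $qid in
        ''|*[!0-9]*)
            echo "questionnaireId must be a number: $qid" >&2
            return 1
            ;;
    esac

    # FOP can point at another launcher; it defaults to fop on the PATH.
    ${FOP:-fop} -xml "$xml" -xsl "$xslt" -pdf "$pdf" \
        -param questionnaireId "$qid" -param show_cover_page false
}

# Example call, equivalent to the command shown earlier:
# generate_form quexml-1.3.10/test_questionnaire1.xml quexml-1.3.10/to_form.xslt newtest.pdf 197
```

The check matters because a mistyped ID would otherwise only show up later, when queXF reads the barcode of the scanned forms.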
Browse and select the pdf file (newtest.pdf) that we created. Enter some description for this form, so that later we can identify the form with this description.
Our form has been uploaded successfully and the barcode has also been detected. Now click on the Continue by setting up page edge detection (page setup) link to set up the page.
We will see a link for each page of the form (numbered 1, 2, …). As our form has only a single page, we see only page 1. Click on the page number to go to the page.
Here the green square boxes are used to detect the edges of the form. If they appear to be in the proper position, there is no need to move or resize them. The blue lines on this page should overlay the corner edges of the form, which means queXF detected the corner lines on the form. If all the edges of the form are detected correctly, click on the Finished page setup link.
After page setup, we will band the form. Banding a form means marking the different fields of the form that are going to be filled in.
Click on Continue with banding to go to the banding process.
Banding a form works in two steps:
1) Identifying the fields of the form
2) Assigning a field name and a field type to each identified field.
Click on the page number to go to that particular page of the form and band the page.
To identify a field, click once on the upper left corner of the field (outside the field boxes) and drag until the selection covers the whole field (i.e. up to the bottom right corner of the field, outside the field boxes).
Here we identified the First Name field of the form.
Then right click on the field and select the field’s type.
Enter the name for the field.
Once banding is completed for the form, we are going to add operators for this form (operators verify the content of a filled-in, uploaded form). Click on the Add operators link to add a new operator.
Enter the username and name for the user and click on the Add user button. But remember that we also have to create a user with the same name in Apache. See my previous post Installing queXF in Ubuntu for creating the user and using authentication in Apache.
The new operator is now created, and we are going to assign this operator to a particular form so that the operator can verify the filled-in forms. Click on the Assign forms to operators link.
We imported our form as MSCIT, and the operator Pranab is going to verify the successfully imported forms of MSCIT. Enable the checkbox for the operator and click on the Assign verifier to questionnaire button.
We will take printouts of the PDF form (the newtest.pdf file that we created) and let our users fill in the form. Once the forms are filled in, we will scan them as PDF files and import them into queXF.
To import the filled-in forms, click on the Import a directory of PDF files link.
The default import directory is queXF Root/doc/filled (in my case it is /var/www/doc/filled). We can change the directory if required. We are going to upload our filled-in, scanned PDF files into this directory (using FTP or SCP). We can run the import process manually by clicking the Process directory: browser window must remain open button, or run it in the background by clicking the Watch this directory in the background (recommended) button. For this example I am going to use the first one.
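Before pointing queXF at the import directory, it can help to make sure only real PDF files end up there. The following is a rough sketch, assuming the scans are already on the server (or the import directory is mounted locally). The function name stage_scans is hypothetical, and the check simply looks for the %PDF marker at the start of each file.

```shell
# Sketch: copy scanned PDFs into the queXF import directory.
# stage_scans is a hypothetical helper, not part of queXF.
# Prints the number of files copied.
stage_scans() {
    src=$1    # directory holding the scanned files
    dest=$2   # queXF import directory, e.g. /var/www/doc/filled
    count=0
    for f in "$src"/*.pdf; do
        # Skip the literal pattern when no file matches.
        [ -f "$f" ] || continue
        # A PDF file starts with the four bytes "%PDF".
        if [ "$(head -c 4 "$f")" = "%PDF" ]; then
            cp "$f" "$dest"/ && count=$((count + 1))
        else
            echo "skipping (not a PDF): $f" >&2
        fi
    done
    echo "$count"
}

# Example call:
# stage_scans /home/me/scans /var/www/doc/filled
```

When the scans are on another machine, a plain copy such as scp *.pdf user@192.168.10.179:/var/www/doc/filled/ does the same job, where user stands for whatever account you have on the queXF server.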
We can see the message of importing (I have uploaded 10 forms).
Sometimes, while importing a form, we may get the message Finding qid...Could not get qid... . It basically appears when queXF is unable to read the barcode of the form.
If some forms fail to import, they will be listed under the Failed imported files link, where we can set whether a failed form may be imported again.
Our forms are successfully imported, so now it is the operator's turn to verify the uploaded forms. Open the queXF site and go to the Verify link.
It will show us how many forms are there to verify. Right now we have 10 forms to verify. Click on the Assign next form link to assign a form for verification.
For each field, the operator has to enter the value for each box of the field. As this is the initial stage of queXF, the ICR process is not yet trained, so automatic recognition of the filled-in characters will not take place.
Note: The ICR process in queXF may need approximately 400 instances of each character to achieve good recognition.
Initially, enter the correct characters for all the fields. ICR depends on the correctness of the operator’s verification phase. If wrong characters are entered during this phase, the ICR training phase will learn from wrong characters, and as a result the ICR process will recognise characters wrongly. So verify all the fields of the form carefully. For navigating inside a form, follow the queXF Administration Manual.
Once we have verified and filled in all the fields of the form, we arrive at the following page. Here you can submit the completed form to the database, review all the fields of the form, or clear all previous entries of the form and start verification again. We will complete the verification process for this form, so click on the Submit completed form to database link.
We can go to the next form by clicking the Assign next form link.
Once forms are verified, we are going to train the ICR process. Go to the link Train ICR.
In the Train ICR link, we will see the available forms in the system. Click on the form name to start ICR training from inputs of this form.
Now I have to choose which verifiers’ inputs are to be included in this training. Select the verifier(s) and click on the Continue training button.
In ICR training, we have to choose the characters to be included in the training. In the table we can see the characters and the number of instances of each character. We can select the characters by ticking the Include in training checkbox for each character and start the training by clicking the Start training process in background button. This runs the ICR training process in the background, without any input from us. But if we want to verify that each instance of a character is correctly entered and detected by the ICR training process, we have to train the process manually. Click on the Manually train link for a character to start the manual training process for that character.
E.g. for the character 0, we can check whether all the instances are correctly entered and detected. If everything is fine, click on the Train button to start the training process.
If some instances of a character are wrongly detected, we can remove them from training by clicking on them. On clicking, the character instance turns from green to red, indicating that it will not be added to the training process. We can also correct a wrongly detected character instance: just enter the correct character in the small text field below the character instance.
After training process for a character is completed, we can see how many characters were added to the ICR KB.
Once the ICR process is sufficiently trained, we can see auto detection of characters in our filled up forms when we open a form for verification.
We can also download the form data from the Output unverified data and Output data/ddi links.
The Output unverified data link contains data from the forms that were successfully imported into the system. This data is detected automatically by the system and has not been verified by an operator.
The Output data/ddi link contains data which is already verified by the operator.
We can download the data in various formats.
Sample data downloaded for our form in CSV format.