OCR Jobs Management Portlet
This portlet uses the OCR Stateful Web Service to submit OCR jobs, poll their status, and delete them.
Submit an OCR job
To submit a new OCR job, you can use the 'Submit Job' tab of the portlet. The input fields are:
- Job name (optional): a label that makes the job easy to identify. If you fill in this field, the unique job id is prefixed with the job name.
- Execution type: select where the OCR process is executed, on "gcube" or "glite" worker nodes. If you select "glite", you must also upload a proxy file using the "Upload proxy" form.
- Type of job: select the granularity of the OCR process. In "Single" mode the input is one PDF file; in "Bulk" mode the input is a zip file containing many PDFs.
- Language: select the language of the input PDFs. The options are "English", "Deutch", "French", "Italian", "Dutch", and "Spanish".
- Input access: select how the input (a PDF file, or a .zip with many PDFs) is provided. Select "Reference" to give an http/ftp reference such as "http://dl.dropbox.com/u/19792897/NobelAnnounce.pdf" or "ftp://www.di.uoa.gr/NobelAnnounce.pdf", "CMS Reference" to give a cms uri such as "cms://14c1fb40-9116-11e0-90f7-ca34f60d2e2d/c7feb1e0-d4bd-11e0-a12e-fda94ff03821", or "Upload" to upload a file from your filesystem using the form.
- Reference/Upload file: use this field to supply the value for the input access option you chose. For "Reference" and "CMS Reference" a text box is shown; for "Upload" a file upload form appears.
- Upload proxy: this field only appears if you have selected "glite" as execution type. Use this form to upload your proxy file for the grid.
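The rules implied by the fields above (a proxy file is required only for "glite", and each input access option expects a particular kind of value) can be sketched as a small validation routine. This is only an illustration of the form's logic, not the portlet's actual code: the `Submission` class, the `validate` function, and the example.org address are hypothetical names introduced here.

```python
from dataclasses import dataclass
from typing import List, Optional
from urllib.parse import urlparse

EXECUTION_TYPES = {"gcube", "glite"}
JOB_TYPES = {"Single", "Bulk"}
# URI schemes accepted per input access option, taken from the examples in the text.
ACCESS_SCHEMES = {"Reference": {"http", "ftp"}, "CMS Reference": {"cms"}}


@dataclass
class Submission:
    """Hypothetical container for the 'Submit Job' form fields."""
    execution_type: str
    job_type: str
    input_access: str            # "Reference", "CMS Reference" or "Upload"
    input_value: str             # a URI, or a local file name for "Upload"
    job_name: str = ""           # optional
    proxy_file: Optional[str] = None


def validate(sub: Submission) -> List[str]:
    """Return a list of problems; an empty list means the form looks consistent."""
    errors = []
    if sub.execution_type not in EXECUTION_TYPES:
        errors.append("unknown execution type")
    if sub.execution_type == "glite" and not sub.proxy_file:
        errors.append("glite execution requires a proxy file")
    if sub.job_type not in JOB_TYPES:
        errors.append("unknown job type")
    if sub.input_access in ACCESS_SCHEMES:
        scheme = urlparse(sub.input_value).scheme
        if scheme not in ACCESS_SCHEMES[sub.input_access]:
            errors.append(f"'{sub.input_access}' does not accept scheme '{scheme}'")
    elif sub.input_access != "Upload":
        errors.append("unknown input access")
    return errors


# A bulk job on gcube nodes with an http reference needs no proxy file.
job = Submission("gcube", "Bulk", "Reference",
                 "http://example.org/many_pdfs.zip", job_name="many_pdfs")
print(validate(job))
```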
Example: we want to run OCR on a zip file containing many PDFs. We submit an OCR job with the job name "many_pdfs", choose execution on gcube nodes, and set its type to "Bulk". Since we provide the zip file over http, we select "Reference" as Input access and enter the http reference. Because the execution type is not "glite", we don't need to provide a proxy file.
After pressing the "Submit" button, we get the job id, which is the job name we chose plus a unique identifier.
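The shape of such a job id can be illustrated with a short sketch. The actual identifier format and separator used by the service are not documented here, so the UUID and the "-" separator are assumptions, and `make_job_id` is a hypothetical helper.

```python
import uuid


def make_job_id(job_name: str = "") -> str:
    """Build a job id: the optional job name as prefix, then a unique part.

    Assumption: a random UUID stands in for the service's unique identifier,
    and "-" is used as the separator.
    """
    unique = str(uuid.uuid4())
    return f"{job_name}-{unique}" if job_name else unique


print(make_job_id("many_pdfs"))  # starts with "many_pdfs-"
print(make_job_id())             # no job name: only the unique part
```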
Poll the status of an OCR job
You can poll the status of previously submitted OCR jobs by using the "Poll status" tab. Choose a job id from the drop-down list to see the status of that job.
If the job has completed execution without errors, you can use the "Download" buttons to retrieve its output files.
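Checking a job repeatedly until it finishes follows the usual polling pattern, sketched below. The status names ("RUNNING", "COMPLETED", "FAILED") and the `get_status` callable are hypothetical stand-ins, since this page does not list the service's real status values or client API.

```python
import time
from typing import Callable

# Assumed status names; the real service may use different ones.
TERMINAL_STATUSES = {"COMPLETED", "FAILED"}


def wait_for_job(job_id: str,
                 get_status: Callable[[str], str],
                 interval: float = 5.0,
                 timeout: float = 600.0,
                 sleep: Callable[[float], None] = time.sleep) -> str:
    """Poll get_status(job_id) until a terminal status or until timeout."""
    deadline = time.monotonic() + timeout
    while True:
        status = get_status(job_id)
        if status in TERMINAL_STATUSES:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job {job_id} still '{status}' after {timeout}s")
        sleep(interval)


# Usage with a fake status source standing in for the web service.
fake = iter(["RUNNING", "RUNNING", "COMPLETED"])
print(wait_for_job("many_pdfs-1234", lambda _id: next(fake), interval=0))
```

Passing `sleep` as a parameter keeps the loop easy to test without real delays.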
Delete an OCR job
If you want an OCR job to stop appearing in your list boxes, you can use the "Delete Job" tab. Choose a job id from the drop-down list and confirm that you want the job to be deleted permanently.