A very common request: how do I extract data from a signed PDF using LiveCycle ES?
To do this you need to have the LiveCycle server software installed. This example uses the processFormSubmission operation of the Forms service.
The attached PDF explains the process. It also contains the process LCA and the test file needed to run the process.
Click here
This process can be used when you receive the signed PDF by email or through a watched folder. It can also be used when you submit the signed PDF from Workspace.
It works very well! Can you provide a deeper explanation of how the namespaces are used to get the data into the variables with the setValue operation?
Thanks a lot,
Carlos
Hi Carlos
If you look at the data extracted from the PDF by the processFormSubmission operation, it has two namespaces defined, namely xdp and xfa. To access the data in the XML, you also have to define these namespaces, and we define them in our process. A namespace consists of a prefix and a URI. For example, I had the following namespace defined in the process:
d http://ns.adobe.com/xdp/. Here d is the namespace prefix and “http://ns.adobe.com/xdp/” is the namespace URI. If you look at the XML data, it has a namespace prefix xdp that points to “http://ns.adobe.com/xdp/”.
Then in my setValue I used the “d” prefix to access the XML data. Basically, wherever the xdp prefix was used, I replaced it with my own prefix, d in this case.
Let me know if you have any more questions
thanks
girish
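The point about prefixes can be tried outside LiveCycle. Below is a minimal Python sketch, not the LiveCycle setValue operation itself, showing that an XPath lookup matches on the namespace URI and that the prefix is only a local alias. The sample XML, the form1/lastName element names, the XML_Data wrapper, and the URI used for the xfa prefix are assumptions made for illustration; only the xdp URI comes from the reply above.

```python
import xml.etree.ElementTree as ET

XDP_URI = "http://ns.adobe.com/xdp/"  # URI quoted in the reply above
# Assumption: URI commonly seen for xfa:datasets; check your own extracted XML.
XFA_DATA_URI = "http://www.xfa.org/schema/xfa-data/1.0/"

# Hypothetical data shaped like an XDP submission; form1/lastName are invented.
SAMPLE = f"""<XML_Data>
  <xdp:xdp xmlns:xdp="{XDP_URI}">
    <xfa:datasets xmlns:xfa="{XFA_DATA_URI}">
      <xfa:data>
        <form1>
          <lastName>Smith</lastName>
        </form1>
      </xfa:data>
    </xfa:datasets>
  </xdp:xdp>
</XML_Data>"""

root = ET.fromstring(SAMPLE)


def read_last_name(prefix_map, xdp_prefix, xfa_prefix):
    """Build the path with the caller's prefixes and look up the value."""
    path = (
        f"{xdp_prefix}:xdp/{xfa_prefix}:datasets/"
        f"{xfa_prefix}:data/form1/lastName"
    )
    return root.findtext(path, namespaces=prefix_map)


# The document says "xdp" and "xfa"; the lookup uses "d" and "f" instead.
with_d = read_last_name({"d": XDP_URI, "f": XFA_DATA_URI}, "d", "f")

# Any other prefix works too, as long as it is bound to the same URI.
with_other = read_last_name({"anything": XDP_URI, "x": XFA_DATA_URI}, "anything", "x")

print(with_d, with_other)
```

Both lookups return the same value because the match is made on the URI. This is why replacing xdp with d in the setValue expression works once d is declared with the same URI in the process.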
Your example works really well, but in my process I’m having problems retrieving and setting the variable values from the resulting XML data.
When I print the variables to the log after using the setValue operation, I get null values (although the XML variable holds the whole XDP with the data and the PDF chunk):
2009-04-28 11:07:18,359 INFO [STDOUT] [PID:5,812] /process_data/apellido_afiliado: null
The XPath expressions i’m using are:
LOCATION
/process_data/@apellido_afiliado
EXPRESSION
/process_data/XML_Data/d:xdp/f:datasets/dd:data/DatosAfiliado/apellido
What am I doing wrong? Could the problem be that the root node in my schema does not have the same name as the root form element in my object hierarchy?
If you think that sending all my XML data would be useful, please tell me.
Thanks again for all your help,
Carlos
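The thread does not record what the cause turned out to be. As a diagnostic aid, here is a Python sketch (again outside LiveCycle) of two ways an expression like the one above can silently return nothing: a prefix bound to the wrong URI, and an element name in the path that differs from the one in the data. The sample XML, the value "Perez", the wrong URI, and the xfa namespace URI are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

XDP_URI = "http://ns.adobe.com/xdp/"
# Assumption: URI commonly seen for xfa:datasets and xfa:data.
XFA_DATA_URI = "http://www.xfa.org/schema/xfa-data/1.0/"

# Hypothetical data; XML_Data stands in for the process variable.
SAMPLE = f"""<XML_Data>
  <xdp:xdp xmlns:xdp="{XDP_URI}">
    <xfa:datasets xmlns:xfa="{XFA_DATA_URI}">
      <xfa:data>
        <DatosAfiliado>
          <apellido>Perez</apellido>
        </DatosAfiliado>
      </xfa:data>
    </xfa:datasets>
  </xdp:xdp>
</XML_Data>"""

root = ET.fromstring(SAMPLE)
PATH = "d:xdp/f:datasets/dd:data/DatosAfiliado/apellido"

# Case 1: every prefix is bound to the URI actually used in the data.
good_map = {"d": XDP_URI, "f": XFA_DATA_URI, "dd": XFA_DATA_URI}
found = root.findtext(PATH, namespaces=good_map)

# Case 2: "dd" is bound to a different URI, so dd:data matches nothing.
wrong_uri_map = {"d": XDP_URI, "f": XFA_DATA_URI, "dd": "http://example.com/wrong/"}
missing_by_uri = root.findtext(PATH, namespaces=wrong_uri_map)

# Case 3: the path names a root form element that is not in the data.
wrong_name_path = "d:xdp/f:datasets/dd:data/form1/apellido"
missing_by_name = root.findtext(wrong_name_path, namespaces=good_map)

print(found, missing_by_uri, missing_by_name)
```

In cases 2 and 3 no error is raised; the lookup simply finds nothing, which would be consistent with the null in the log line above. Comparing each prefix's URI and each element name against the extracted XML is a reasonable first check.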
Hi Carlos
Send your PDF file to mergeandfuse@gmail.com
I will take a look and send you the solution
thanks
girish
Hi Girish,
Your example is really very nice. In my case we don't want the signed PDF, so I ignored that part.
We want to export the data into an Excel sheet, so could you please help me with this?
Thanks in advance.
Sameer
Hi
Are you using LiveCycle server software?
Yes Girish,
LiveCycle ES 8.2
Hi Sameer
Process 1 is easy.
Process 2 may not be that easy (does it have to be an Excel sheet?)
thanks
girish
Hi Girish,
Thanks for your quick response!
Actually, I am looking for two processes:
1) one that extracts the data from a PDF dropped into the watched folder and saves it to the MySQL DB.
2) another process, initiated by the user, that exports the MySQL data to an Excel sheet.
So whenever you have some free time, please help me with this.
Regards,
Sharique
Hi Girish,
Thanks for your support.
I have completed process 1, but I am not able to do the second one (you wrote: "Process 2 may not be that easy, does it have to be excel sheet").
Yes, the data from the MySQL DB has to be exported through an LC process.
Regards,
Sharique
Hi,
I have been trying for the past 2 days to get the data in XML form from the processFormSubmission component. Please send me the sample at renjithvijayan2005@gmail.com
thanks
Renjith
Can you please explain the use case? Are you getting errors when you use the processFormSubmission component? Did you look at my blog post?
Click to access extracting-data-from-signed-pdf.pdf
let me know
thanks
girish
Hi Girish ,
I tried your sample program and it works fine. It is exactly what I am looking for. I just want to know how you define the namespace in the process, i.e. how and where do you assign the namespace to “d”?
Hi
If you right-click the process and open its properties, you should see the namespace defined there
thanks
girish
How can you submit a PDF or other document to a process in Workspace?
Hi Girish,
Thanks for the nice post on extracting data from a PDF. I am having difficulty implementing your example. I created a process named “processFormSubmission” in LiveCycle Workbench with two activities: the default start point and the processFormSubmission operation from the Forms service. I defined the input to this process as a document variable, which is simply the PDF file, set the content type to PDF and PDF to XDP to true, and mapped the output to a document variable, which is simply the XML data extracted from the PDF. I am not sure where to reference the imported ExtractDataFromSignedPDF.lca file in the process, and I am not sure which namespaces need to be created. Without the LCA and the namespaces, I got the response “The invocation of this long-lived process returned job-id ” when I invoked the process from Workbench.
Please let me know what I am doing wrong.
Also, please share the complete steps to extract data from any input PDF passed into the process, and let me know whether it is possible to extract XML from a flattened PDF document.
I appreciate your quick support on this. Thanks in advance.