Within the last two years, many companies had to ask their customers to sign SEPA Direct Debit Mandates. The established procedure is to send out a form pre-filled with the customer's data (the SEPA Mandate). The customer signs the mandate and sends it back to the company.
One of our customers, an insurance company, uses Kofax Capture and Kofax Transformation Modules (KTM) for mailroom automation. In this context, the SEPA Mandates had to be recognized by KTM and the appropriate business process had to be triggered.
Initially, two processes for SEPA Mandates were established:
- The customer has signed the mandate: the flag ‘SEPA Mandate was granted’ is set. No further action is needed.
- The customer did not sign the mandate: further administrative processing must be started.
Within KTM, the signature is detected using an advanced zone locator and the blackness values of its zones.
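To make the idea of a blackness value concrete, here is a minimal sketch in Python. It is not KTM's implementation: it assumes a zone is available as a binarized grid of pixels (1 = black, 0 = white), and the function names and the 5% threshold are hypothetical values chosen for illustration only.

```python
def blackness(zone):
    """Return the fraction of black pixels in a zone (rows of 0/1 values)."""
    total = sum(len(row) for row in zone)
    if total == 0:
        return 0.0
    black = sum(sum(row) for row in zone)
    return black / total


def looks_signed(zone, threshold=0.05):
    """Treat the zone as signed if enough of it is black (threshold is an assumption)."""
    return blackness(zone) >= threshold


# An empty signature zone and one with a few pen strokes.
empty_zone = [[0] * 10 for _ in range(4)]
signed_zone = [
    [0, 0, 1, 1, 0, 0, 1, 0, 0, 0],
    [0, 1, 0, 0, 1, 1, 0, 1, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
]

print(looks_signed(empty_zone))   # False
print(looks_signed(signed_zone))  # True
```

In a real project the threshold would have to be tuned on sample documents, since pre-printed lines or scanner noise inside the zone also contribute to its blackness.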
Over time, this clear-cut concept broke down. On the one hand, customers changed or supplemented the pre-filled customer data with handwritten comments because the data was wrong or incomplete. On the other hand, some customers received blank SEPA Mandates and filled them in by hand.
Thus the insurance company needed a third process for SEPA Mandates:
- The customer has signed the mandate, but handwritten notes were added within the ‘customer data’ region of the mandate. A process to change or add customer data must be started.
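The three processes described so far boil down to a small decision rule. The following Python sketch spells it out; the function name and the process labels are hypothetical and only stand in for whatever the downstream workflow actually expects.

```python
def route_mandate(is_signed, has_handwritten_notes):
    """Pick the follow-up process for a returned SEPA Mandate."""
    if not is_signed:
        # Unsigned mandates always go to manual follow-up.
        return "administrative_processing"
    if has_handwritten_notes:
        # Signed, but the customer data region was edited by hand.
        return "update_customer_data"
    # Signed and unchanged: set the 'SEPA Mandate was granted' flag.
    return "mandate_granted"


print(route_mandate(True, False))   # mandate_granted
print(route_mandate(True, True))    # update_customer_data
print(route_mandate(False, False))  # administrative_processing
```

The signature check comes first, so an unsigned mandate is routed to administrative processing regardless of any handwritten notes.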
The challenge for KTM was to recognize whether there are handwritten notes in defined regions of the SEPA Mandate.
This is an example of a filled mandate, which was signed by the customer:
And now an example with handwritten changes by the customer:
To identify the handwritten notes, you can use the ‘Mixed Print’ OCR engine, which reads both machine print and handwriting on a document. However, we are not interested in the content of the handwritten notes; we just want to know whether there are any at all. The ‘Mixed Print’ engine won’t give good results for the content of the notes anyway, because machine print and handwriting often overlap in these cases.
But the ‘Mixed Print’ engine does tell us whether there was any handwriting at all. Candidates for handwritten notes are marked with so-called ‘boxes’. You can view these ‘boxes’ with the XDOC browser, which comes with the KTM installation. First, run the ‘Mixed Print’ engine on the mandate document (you can do that in the KTM Project Builder). Then start the XDOC browser and open the xdc file of the mandate document:
‘Representation 0’ (the ‘Mixed Print’ engine) has three ‘boxes’. Each box stands for a region with candidates for handwriting. These ‘boxes’ can be retrieved by KTM scripting. Once you restrict the search for ‘boxes’ to a region of the mandate, you have everything you need to judge whether somebody scribbled on your form.
To define the ‘search region’ you could use the words ‘one-off payment’ (upper right corner, defines the upper bound of the search region) and ‘By signing this mandate form’ (text underneath the customer data, defines the lower bound of the search region). To find these words you could use format locators or search directly within the OCR result of the document. The following scripting example looks directly into the OCR result. The function Is_handwritten returns TRUE if at least one ‘box’ is found within the search region.
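Before turning to the KTM script, the region test itself can be modelled outside of KTM. The Python sketch below is such a model, not KTM code: the Box class and the sample coordinates are hypothetical, while the thresholds (width greater than 200 to skip dirt, left greater than 275 to skip the punched or barcoded margin) are the ones used in the article's script.

```python
from dataclasses import dataclass


@dataclass
class Box:
    """Hypothetical stand-in for a handwriting candidate box (pixel coordinates)."""
    top: int
    left: int
    width: int


def count_handwriting_boxes(boxes, start_top, end_top, min_width=200, min_left=275):
    """Count boxes strictly between start_top and end_top that are wide enough
    and lie to the right of the left margin."""
    return sum(
        1
        for b in boxes
        if start_top < b.top < end_top and b.width > min_width and b.left > min_left
    )


def is_handwritten(boxes, start_top, end_top):
    """True if at least one box falls inside the search region."""
    return count_handwriting_boxes(boxes, start_top, end_top) > 0


boxes = [
    Box(top=500, left=400, width=350),   # note inside the customer data region
    Box(top=520, left=100, width=300),   # left margin: holes or barcode
    Box(top=600, left=400, width=50),    # too narrow: dirt
    Box(top=1500, left=400, width=400),  # below the region, e.g. the signature
]

print(count_handwriting_boxes(boxes, start_top=380, end_top=1200))  # 1
print(is_handwritten(boxes, start_top=380, end_top=1200))           # True
```

Keeping the geometric test separate from the OCR calls makes it easy to try different thresholds against box coordinates read off the XDOC browser.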
The example script needs a reference to ‘Kofax memphis Forms 4.0’. So please add this reference in your KTM script:
The underlying KTM project uses OCR with RecoStar or FineReader by default. To check whether somebody scribbled on the mandate, you can use the following function:
Function Is_handwritten(pXDoc As CASCADELib.CscXDocument) As Boolean
   'Checks whether something handwritten is in a region of the page
   Dim i As Integer
   Dim BoxAnzahl As Integer 'number of boxes found in the search region
   Dim StartTOP As Long
   Dim EndeTOP As Long
   'Search 'one-off payment' and add 80 to TOP. Only look south.
   For i=0 To pXDoc.TextLines.Count-1
      If InStr(LCase(pXDoc.TextLines(i).Text),"one-off payment")>0 Then
         StartTOP=pXDoc.TextLines(i).Top 'top edge of the line that marks the upper bound
         Exit For
      End If
   Next i
   StartTOP=StartTOP+80 '~ line height
   'Search 'By signing this mandate form'. Only look north of this.
   'The search string is lower case because the line text is passed through LCase.
   For i=0 To pXDoc.TextLines.Count-1
      If InStr(LCase(pXDoc.TextLines(i).Text),"by signing this mandate form")>0 Then
         EndeTOP=pXDoc.TextLines(i).Top 'top edge of the line that marks the lower bound
         Exit For
      End If
   Next i
   'Re-OCR with engine 'Mixed Print'
   FullPageRecognition_1(pXDoc, "", "Mixed Print")
   'only count boxes south of StartTOP
   'only count boxes north of EndeTOP
   'Box.width>200 to avoid 'dirt'
   'Box.left>275 to leave out the left border (holes, barcodes)
   For i= 0 To pXDoc.Boxes.Count-1
      If pXDoc.Boxes.ItemByIndex(i).Top>StartTOP And pXDoc.Boxes.ItemByIndex(i).Width>200 And pXDoc.Boxes.ItemByIndex(i).Left>275 And pXDoc.Boxes.ItemByIndex(i).Top<EndeTOP Then
         BoxAnzahl=BoxAnzahl+1
      End If
   Next i
   'OCR back to RecoStar or FineReader, for standard processing
   FullPageRecognition_1(pXDoc, "", "RecoStar")
   If BoxAnzahl>0 Then 'at least one box: there was some handwriting!
      Is_handwritten=True
   Else
      Is_handwritten=False
   End If
End Function
And finally the called procedure FullPageRecognition_1, which performs the re-OCR:
Public Sub FullPageRecognition_1(ByVal pXDoc As CscXDocument, ByVal ImageCleanProfile As String, ByVal OCRProfile As String)
   'remove existing OCR results and perform OCR on page one with profile OCRProfile
   'ImageCleanProfile is not used in this example
   Dim i As Integer
   Dim oPRP As IMpsPageRecogProfile
   Dim oPR As New MpsPageRecognizing
   'OCR only on page 1
   '# Remove any representations, before proceeding to perform full page recognition
   For i = pXDoc.Representations.Count -1 To 0 Step -1
      pXDoc.Representations.Remove(i)
   Next i
   Set oPRP = Project.RecogProfiles.ItemByName(OCRProfile) '# Use the page recognition profile OCRProfile
   oPR.Recognize(pXDoc, oPRP, 0) '# Perform recognition on the first page
   '# At design time the text lines need to be analysed. At runtime this will be done automatically
   If Project.ScriptExecutionMode = CscScriptExecutionMode.CscScriptModeServerDesign Then pXDoc.Representations(0).AnalyzeLines
End Sub