Kofax Transformation Modules: SEPA Mandates and handwritten additional information – or: who scribbled on my form?

19.2.2016 | 5 minutes of reading time

Within the last two years, many companies had to ask their customers to sign SEPA Direct Debit Mandates. The established procedure is to send out a form pre-filled with the customer's data (the SEPA Mandate). The customer signs the mandate and sends it back to the company.

One of our customers – an insurance company – uses Kofax Capture and Kofax Transformation Modules (KTM) for mailroom automation. In this context, the SEPA Mandates had to be recognized by KTM and the appropriate business process had to be triggered.

Until then, two processes for SEPA Mandates had been established:

  • The customer has signed the mandate: the flag ‘SEPA Mandate was granted’ is set. No further action is needed.
  • The customer did not sign the mandate: further administrative processing must be started.

Within KTM, the signature is recognized by an advanced zone locator using the blackness values of zones.
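To illustrate the idea behind a blackness value, here is a minimal Python sketch. It is not KTM code and does not claim to reproduce what the advanced zone locator does internally. It assumes a binarized page given as rows of 0 (white) and 1 (black) pixels; the function names and the threshold are hypothetical and would have to be tuned on real mandates.

```python
def zone_blackness(image, left, top, width, height):
    """Return the fraction of black pixels (value 1) inside a rectangular zone."""
    black = 0
    total = 0
    for row in image[top:top + height]:
        cells = row[left:left + width]
        black += sum(cells)
        total += len(cells)
    return black / total if total else 0.0


def is_signed(image, zone, threshold=0.02):
    """Treat the zone as signed if its blackness reaches the threshold.

    The default threshold is an assumption for illustration only.
    """
    return zone_blackness(image, *zone) >= threshold


# Tiny made-up page: 4 rows x 6 columns, signature zone = rows 1..3.
blank = [[0] * 6 for _ in range(4)]
signed = [row[:] for row in blank]
signed[2][1:5] = [1, 1, 1, 1]          # a short 'pen stroke'
signature_zone = (0, 1, 6, 3)          # left, top, width, height
```

An empty signature field yields a blackness close to zero, while a pen stroke raises it above the threshold.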

Over time, this concept no longer covered all cases. On the one hand, customers changed or supplemented the pre-filled customer data with handwritten comments because the data was wrong or incomplete. On the other hand, some customers received blank SEPA Mandates, which they filled in by hand.

Thus the insurance company needed a third process for SEPA Mandates:

  • The customer has signed the mandate, but handwritten notes were added within the ‘customer data’ region of the mandate: a process to change or add customer data must be started.
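The three processes can be summarized as a small decision function. This is only a Python sketch of the routing described above; the function name and the process labels are hypothetical, and the handling of an unsigned mandate that also carries handwritten notes (here: the missing signature takes precedence) is an assumption, since that case is not described.

```python
def choose_process(signed, handwritten_notes):
    """Map the two recognition results to one of the three processes."""
    if not signed:
        # Assumption: a missing signature wins over any handwritten notes.
        return "administrative_processing"
    if handwritten_notes:
        return "customer_data_change"
    return "mandate_granted"
```

In the KTM project, the two inputs would come from the signature zone and from the handwriting check, respectively.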

The challenge for KTM was to recognize whether there are handwritten notes in defined regions of the SEPA Mandate.

This is an example of a filled mandate, which was signed by the customer:

And now an example with handwritten changes by the customer:

To identify the handwritten notes, you can use the OCR engine ‘Mixed Print’, which is designed to read both typescript and handwriting on a document. However, we are not interested in the content of the handwritten notes – we just want to know whether there are any at all. The ‘Mixed Print’ engine would not give good results for the content anyway, because in these cases typescript and handwriting often overlap.

What the ‘Mixed Print’ engine does provide is the information whether there was handwriting at all. Candidates for handwritten notes are marked with so-called ‘boxes’. You can view these ‘boxes’ with the XDOC browser, which comes with the KTM installation. First, run the ‘Mixed Print’ engine on the mandate document (you can do that in the KTM Project Builder). Then start the XDOC browser and open the xdc file of the mandate document:

‘Representation 0’ (the ‘Mixed Print’ engine) has three ‘boxes’. Each box stands for a region with candidates for handwriting. These ‘boxes’ can be retrieved by KTM scripting. Once you have selected a region of the mandate in which to look for ‘boxes’, you have everything you need to judge whether somebody scribbled on your form.

To define the ‘search region’, you could use the words ‘one-off payment’ (upper right corner, defines the upper bound of the search region) and ‘By signing this mandate form’ (text underneath the customer data, defines the lower bound of the search region). To find these words, you could use format locators or search directly within the OCR result of the document. The following scripting example looks directly into the OCR result. The function Is_handwritten returns True if at least one ‘box’ is found within the search region.

The example script needs a reference to ‘Kofax memphis Forms 4.0’. So please add this reference in your KTM script:

The underlying KTM project uses OCR recognition with RecoStar or FineReader by default. To check whether somebody scribbled on the mandate, you may use the following function:

Function Is_handwritten(pXDoc As CASCADELib.CscXDocument) As Boolean
   'Checks whether something handwritten is in a region of the page
   Dim i As Integer
   Dim BoxAnzahl As Integer  'number of boxes found in the search region
   Dim StartTOP As Long      'upper bound of the search region
   Dim EndeTOP As Long       'lower bound of the search region

   'Search 'one-off payment' and add 80 to TOP. Only look south.
   For i = 0 To pXDoc.TextLines.Count - 1
      If InStr(LCase(pXDoc.TextLines(i).Text), "one-off payment") > 0 Then
         StartTOP = pXDoc.TextLines(i).Top
         StartTOP = StartTOP + 80 '~ line height
         Exit For
      End If
   Next

   'Search 'By signing this mandate form'. Only look north of this.
   'The search text must be lower case because the line text is passed through LCase.
   'If the text is not found, EndeTOP stays 0 and no box is counted.
   For i = 0 To pXDoc.TextLines.Count - 1
      If InStr(LCase(pXDoc.TextLines(i).Text), "by signing this mandate form") > 0 Then
         EndeTOP = pXDoc.TextLines(i).Top
         Exit For
      End If
   Next

   'Re-OCR with engine 'Mixed Print'
   FullPageRecognition_1(pXDoc, "", "Mixed Print")

   'only count boxes south of StartTOP
   'only count boxes north of EndeTOP
   'Box.width>200 to avoid 'dirt'
   'Box.left>275 to leave out the left border (holes, barcodes)
   For i = 0 To pXDoc.Boxes.Count - 1
      If pXDoc.Boxes.ItemByIndex(i).Top > StartTOP And pXDoc.Boxes.ItemByIndex(i).Width > 200 And pXDoc.Boxes.ItemByIndex(i).Left > 275 And pXDoc.Boxes.ItemByIndex(i).Top < EndeTOP Then
         BoxAnzahl = BoxAnzahl + 1
      End If
   Next

   'OCR back to RecoStar or FineReader, for standard processing
   FullPageRecognition_1(pXDoc, "", "RecoStar")

   If BoxAnzahl > 0 Then 'at least one box: there was some handwriting!
      Is_handwritten = True
   Else
      Is_handwritten = False
   End If
End Function
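The box filter at the end of Is_handwritten can be tried out outside of KTM before touching the project. The following Python sketch models only that filter: the Box type, the function names and all sample coordinates are made up for illustration, while the limits 200 and 275 are the ones used in the script above.

```python
from typing import NamedTuple


class Box(NamedTuple):
    """Hypothetical stand-in for a 'Mixed Print' box (pixel coordinates)."""
    left: int
    top: int
    width: int


MIN_WIDTH = 200   # narrower boxes are treated as 'dirt'
MIN_LEFT = 275    # skip the left border (holes, barcodes)


def count_handwriting_boxes(boxes, start_top, end_top):
    """Count boxes strictly between start_top and end_top that pass both limits."""
    return sum(
        1
        for b in boxes
        if start_top < b.top < end_top and b.width > MIN_WIDTH and b.left > MIN_LEFT
    )


def is_handwritten(boxes, start_top, end_top):
    """True if at least one box survives the filter."""
    return count_handwriting_boxes(boxes, start_top, end_top) > 0


# Made-up sample: search region from 580 (500 + 80 line height) to 1500.
sample_boxes = [
    Box(left=300, top=700, width=400),    # inside the region: counted
    Box(left=50, top=800, width=400),     # left border: ignored
    Box(left=300, top=900, width=50),     # too narrow: ignored
    Box(left=300, top=1800, width=400),   # below the region: ignored
]
```

Note that, as in the script, a lower bound of 0 (anchor text not found) means that no box can pass the filter.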

And finally the called procedure FullPageRecognition_1, which performs the re-OCR:

Public Sub FullPageRecognition_1(ByVal pXDoc As CscXDocument, ByVal ImageCleanProfile As String, ByVal OCRProfile As String)
   'remove existing OCR results and perform OCR on page one with profile OCRProfile
   Dim i As Integer
   Dim oPRP As IMpsPageRecogProfile
   Dim oPR As New MpsPageRecognizing

   'OCR only on page 1
   pXDoc.CDoc.Pages(0).SuppressOCR = False

   '# Remove any representations, before proceeding to perform full page recognition
   For i = pXDoc.Representations.Count - 1 To 0 Step -1
      pXDoc.Representations.Remove (i)
   Next

   Set oPRP = Project.RecogProfiles.ItemByName(OCRProfile)   '# Use the page recognition profile OCRProfile
   oPR.Recognize(pXDoc, oPRP, 0)                             '# Perform recognition on the first page

   '# At design time the text lines need to be analysed. At runtime this will be done automatically
   If Project.ScriptExecutionMode = CscScriptExecutionMode.CscScriptModeServerDesign Then pXDoc.Representations(0).AnalyzeLines
End Sub

Older blog articles about KTM and KC:

Kofax Transformation Modules (KTM): ‘free-form recognition’ for handwritten numbers

Kofax Capture – Document Separation and Barcodes

KTM and insurance companies: Document Process Automation

Document classification with Kofax Transformation Modules (KTM)

Kofax Transformation Modules – format locators and dynamic regular expressions – Part 2

Kofax Transformation Modules – format locators and dynamic regular expressions

IBM Content Collector for SAP (formerly known as IBM CommonStore for SAP), Kofax Capture 10 and the IBM CommonStore Release Script
