Parsing unstructured street addresses

Jan 23, 2018
Hi, I'm using PDI as the ETL tool for my data warehouse. One dataset contains an address field that was entered manually, so I have an unstructured string variable with every possible kind of entry.

Examples:

    12 , rue ibn rochd, avenue moulay smail, Casablanca
    barnoussi, bloc 12 imm 5 app 6.
    12 rue ibn koutaiba, casa.

(Note: the addresses are in Casablanca, Morocco.)

Is there any way to extract the street and the district from an unstructured address, perhaps using NLP?

I was thinking of creating a lookup table (from an open-source dataset) with all the street and district names of Casablanca, including their short forms. If any part of an address in the dataset matches a street or district name, or one of its short forms, the matching name would be written to a new column in the target table.
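To make the lookup idea concrete, here is a minimal Python sketch of what I have in mind. Everything in it is an assumption for illustration: the `STREETS` and `DISTRICTS` tables are tiny hand-made stand-ins built from the examples above (the real ones would come from the open-source dataset), and the function names `normalize`, `find_matches` and `parse_address` are hypothetical. It only does exact matching on whole words after normalization, so misspelled names would not be caught.

```python
import re

# Hypothetical lookup tables. In practice they would be loaded from the
# open-source dataset. Keys are lowercase variants (including short forms),
# values are the canonical names to write to the target table.
STREETS = {
    "rue ibn rochd": "Rue Ibn Rochd",
    "avenue moulay smail": "Avenue Moulay Smail",
    "av moulay smail": "Avenue Moulay Smail",  # assumed short form
    "rue ibn koutaiba": "Rue Ibn Koutaiba",
}
DISTRICTS = {
    "barnoussi": "Barnoussi",
}


def normalize(text):
    """Lowercase, replace punctuation with spaces, collapse whitespace."""
    text = text.lower()
    text = re.sub(r"[^\w\s]", " ", text)
    return re.sub(r"\s+", " ", text).strip()


def find_matches(address, lookup):
    """Return canonical names whose variant occurs as whole words in address."""
    padded = f" {normalize(address)} "
    found = []
    for variant, canonical in lookup.items():
        # Padding with spaces prevents matching inside a longer word.
        if f" {variant} " in padded and canonical not in found:
            found.append(canonical)
    return found


def parse_address(address):
    """Extract the known streets and districts mentioned in a raw address."""
    return {
        "streets": find_matches(address, STREETS),
        "districts": find_matches(address, DISTRICTS),
    }


if __name__ == "__main__":
    for raw in [
        "12 , rue ibn rochd, avenue moulay smail, Casablanca",
        "barnoussi, bloc 12 imm 5 app 6.",
        "12 rue ibn koutaiba, casa.",
    ]:
        print(raw, "->", parse_address(raw))
```

An address can mention more than one street, as in the first example, so the sketch returns lists rather than a single value. To tolerate typos, the exact comparison could be swapped for approximate matching, for instance with `difflib.get_close_matches` from the Python standard library, at the cost of some false positives.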

