
Existing data often has no consistent format because it is derived from many sources. It may also contain duplicate records or items, and descriptions may be missing or incomplete. RKIT’s data cleansing process fixes misspellings, abbreviations, and errors. The data is normalized so that items in a class share a common unit of measure, e.g. feet, inches, and meters are all converted to one unit. The values are also standardized so that each attribute is named consistently, e.g. inch, in., and the symbol " are all shown as inch.

Data scrubbing, also called data cleansing, is the process of amending or removing data in a database that is incorrect, incomplete, improperly formatted, or duplicated. An organization in a data-intensive field such as banking, insurance, retailing, telecommunications, or transportation might use a data scrubbing tool to systematically examine data for flaws using rules, algorithms, and look-up tables. Typically, a database scrubbing tool includes programs capable of correcting a number of specific types of mistakes, such as adding missing zip codes or finding duplicate records. Using a data scrubbing tool can save a database administrator a significant amount of time and can be less costly than fixing errors manually.
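
As an illustration of the unit normalization and name standardization described above, here is a minimal Python sketch. It is not RKIT's actual implementation: the alias table, the choice of inch as the common unit, and the function names `standardize_unit` and `normalize_length` are all assumptions made for the example.

```python
from __future__ import annotations

# Hypothetical look-up table: every spelling or symbol maps to one canonical name.
UNIT_ALIASES = {
    "inch": "inch", "inches": "inch", "in": "inch", "in.": "inch", '"': "inch",
    "foot": "foot", "feet": "foot", "ft": "foot", "ft.": "foot", "'": "foot",
    "meter": "meter", "meters": "meter", "metre": "meter", "metres": "meter", "m": "meter",
}

# Conversion factors into the common unit assumed for this class of items (inch).
TO_INCH = {"inch": 1.0, "foot": 12.0, "meter": 1 / 0.0254}


def standardize_unit(raw: str) -> str:
    """Map any known spelling of a unit to its canonical name."""
    key = raw.strip().lower()
    if key not in UNIT_ALIASES:
        # Unknown spellings are surfaced rather than guessed at.
        raise ValueError(f"unknown unit: {raw!r}")
    return UNIT_ALIASES[key]


def normalize_length(value: float, raw_unit: str) -> tuple[float, str]:
    """Convert a length to the common unit and return (value, unit name)."""
    unit = standardize_unit(raw_unit)
    return value * TO_INCH[unit], "inch"
```

Calling `normalize_length(2, "ft.")` converts two feet to inches and labels the result with the canonical name. Raising an error for unrecognized units, instead of passing them through, keeps bad values visible for manual correction.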
Our Data Cleansing Services feature:
- The identification and removal of duplicated records
- The comparison of records against third-party information, such as opt-in and opt-out lists, and the removal of matching records
- The removal of spurious and invalid records
- The removal of obsolete data
- The identification and tagging of similar records for subsequent manual review
- Data validation (for example, using a postcode checker to verify that addresses are correctly formatted)
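
Several of the steps listed above can be sketched in a few lines of Python. This is only an illustrative outline under stated assumptions, not a description of the actual service: the record fields (`name`, `email`, `postcode`), the function names, the 0.85 similarity threshold, and the simplified UK-style postcode pattern are all hypothetical. Removal of obsolete data is left out because it depends on business rules not described here.

```python
import re
from difflib import SequenceMatcher

# Simplified UK-style postcode shape; an assumption for illustration only.
# It checks the format, not whether the postcode actually exists.
POSTCODE_RE = re.compile(r"^[A-Z]{1,2}\d[A-Z\d]? ?\d[A-Z]{2}$")


def normalize_key(record):
    """Build a comparison key: collapsed lowercase name plus lowercase email."""
    name = " ".join(record["name"].lower().split())
    email = record["email"].strip().lower()
    return name, email


def cleanse(records, opt_out_emails, similarity=0.85):
    """Deduplicate, apply an opt-out list, validate postcodes, tag near-duplicates."""
    opt_out = {e.strip().lower() for e in opt_out_emails}
    seen = set()
    kept = []
    for rec in records:
        key = normalize_key(rec)
        if key in seen:
            continue  # exact duplicate after normalization
        if key[1] in opt_out:
            continue  # matches the third-party opt-out list
        seen.add(key)
        postcode = rec["postcode"].strip().upper()
        kept.append(dict(rec,
                         postcode_valid=bool(POSTCODE_RE.match(postcode)),
                         needs_review=False))

    # Tag similar (but not identical) names for later manual review.
    for i, a in enumerate(kept):
        for b in kept[i + 1:]:
            ratio = SequenceMatcher(None, normalize_key(a)[0],
                                    normalize_key(b)[0]).ratio()
            if ratio >= similarity:
                a["needs_review"] = True
                b["needs_review"] = True
    return kept
```

Similar records are only tagged, never removed automatically, which mirrors the manual review step in the list. The pairwise comparison is quadratic in the number of records, so a real tool working on large tables would likely group candidates first.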