OpenRefine

Data cleansing and transformation tool for tabular data

What is this tool?

OpenRefine is a tool for cleaning (deduplication, inconsistency correction, formatting) and transforming tabular data in the browser. It can process large volumes of data in batch, significantly reducing the need for manual corrections.

It saves the state of your cleansing work on the server side, and it records the cleansing steps so that they can be reapplied to similar files.
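The idea of recording steps once and reapplying them to similar files can be illustrated with a minimal Python sketch. The step format here (a column name paired with a function) and the names `steps` and `apply_steps` are hypothetical and are not OpenRefine's own format for recorded operations; the sample rows are made up for illustration.

```python
# Hypothetical step format: (column name, function applied to each cell value).
# This is an illustration only, not OpenRefine's recorded-operation format.
steps = [
    ("city", str.strip),   # remove leading/trailing whitespace
    ("city", str.title),   # unify capitalization
]


def apply_steps(rows, steps):
    """Return new rows with every recorded step applied in order."""
    cleaned = []
    for row in rows:
        new_row = dict(row)  # copy so the input rows stay untouched
        for column, func in steps:
            new_row[column] = func(new_row[column])
        cleaned.append(new_row)
    return cleaned


# Two made-up "files" with the same layout share one recorded procedure.
file_a = [{"city": "  new york "}, {"city": "BOSTON"}]
file_b = [{"city": "chicago  "}]

print(apply_steps(file_a, steps))
print(apply_steps(file_b, steps))
```

Keeping the steps as data rather than as one-off edits is what makes the procedure repeatable on the next file with the same layout.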

Features

  • Filtering and faceting for narrowing down and reviewing values
  • Column splitting/merging, value replacement, whitespace and symbol normalization
  • Duplicate detection and clustering for inconsistency correction
  • Batch transformation using expressions (GREL)
  • Reconciliation with external data sources
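To give a feel for how clustering can surface inconsistent spellings, here is a rough Python sketch of a key-collision approach: each value is reduced to a normalized "fingerprint", and values that share a fingerprint are grouped together. This is a simplified approximation written for illustration, not OpenRefine's actual implementation; the function names and sample values are assumptions.

```python
import string
import unicodedata
from collections import defaultdict


def fingerprint(value):
    """Build a normalized key so near-identical spellings collide."""
    # Fold accented characters to their closest ASCII form.
    text = unicodedata.normalize("NFKD", value)
    text = text.encode("ascii", "ignore").decode("ascii")
    # Lowercase and drop ASCII punctuation.
    text = text.lower().translate(str.maketrans("", "", string.punctuation))
    # Split on whitespace, then de-duplicate and sort the tokens.
    return " ".join(sorted(set(text.split())))


def cluster(values):
    """Group values by fingerprint; keep groups with several spellings."""
    groups = defaultdict(set)
    for value in values:
        groups[fingerprint(value)].add(value)
    return [sorted(group) for group in groups.values() if len(group) > 1]


# Made-up sample column; "\u00e9" is an e with an acute accent.
names = ["Acme Inc.", "acme inc", "Inc. Acme", "Caf\u00e9 Noir", "Cafe Noir", "Globex"]

for group in cluster(names):
    print(group)
```

Each printed group is a candidate set of spellings that a reviewer could merge into a single value; values with no look-alike, such as "Globex" here, are left alone.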

How to use

    1. Load data in CSV / TSV / Excel / JSON or other formats
    2. Use facets and filters to identify problematic values
    3. Clean up data using transformations, replacements, and clustering
    4. Export in the desired format
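The four steps above can be mirrored in a small Python sketch using only the standard library. It is meant to show what each step does conceptually, not how OpenRefine works internally; the sample data and column names are made up, and an in-memory string stands in for a real file.

```python
import csv
import io
from collections import Counter

# Step 1: load tabular data (an in-memory CSV stands in for a real file).
raw = "name,country\nAlice,Japan\nBob, japan\nCarol,JAPAN \nDave,France\n"
rows = list(csv.DictReader(io.StringIO(raw)))

# Step 2: a text facet is essentially a count of each distinct value,
# which makes stray whitespace and case variants easy to spot.
facet = Counter(row["country"] for row in rows)
print(facet)

# Step 3: clean the column by trimming whitespace and unifying case.
for row in rows:
    row["country"] = row["country"].strip().title()

# Step 4: export in the desired format (CSV here).
out = io.StringIO()
writer = csv.DictWriter(out, fieldnames=["name", "country"], lineterminator="\n")
writer.writeheader()
writer.writerows(rows)
print(out.getvalue())
```

Before cleaning, the facet shows four distinct country values even though only two countries are present; after step 3 the three variants of "Japan" collapse into one. In OpenRefine itself, a comparable cell transformation would typically be written as a GREL expression rather than as Python.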

Data formats

  • Input: CSV, TSV, Excel (xls/xlsx), Google Sheets, JSON, XML, OpenDocument, etc.
  • Output: CSV, TSV, Excel, JSON, etc.

Official documentation site

Last updated on 2026-03-06