DataImportHandler
- Dikchant Sahi
Introduction

Imports data from an RDBMS or XML source into Solr,
driven by configuration

Import works across multiple tables

Data is denormalized

Supports full and incremental update

Allows plugging in custom components

Is a contrib module
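
As an illustration of a multi-table, denormalized import, a data-config.xml might look roughly like the sketch below. The table names (item, feature), column names, JDBC driver, URL and credentials are all assumptions made up for this example; only the element and attribute names come from DIH itself.

```xml
<dataConfig>
  <!-- Hypothetical JDBC connection: driver, url, user and password are placeholders -->
  <dataSource type="JdbcDataSource"
              driver="org.postgresql.Driver"
              url="jdbc:postgresql://localhost:5432/shop"
              user="solr" password="secret"/>
  <document>
    <!-- Root entity: one Solr document per row of the hypothetical item table -->
    <entity name="item" query="SELECT id, name FROM item">
      <field column="id" name="id"/>
      <field column="name" name="name"/>
      <!-- Child entity: rows of a second table are folded into the parent document -->
      <entity name="feature"
              query="SELECT description FROM feature WHERE item_id = '${item.id}'">
        <field column="description" name="features"/>
      </entity>
    </entity>
  </document>
</dataConfig>
```

The nested entity runs once per parent row, so each item ends up as a single flat document carrying its features; the target field (here features) would need to be multi-valued in the schema.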
Full Import

Indexes complete data to Solr

Triggered with command=full-import on the /dataimport handler

Updates dataimport.properties with the time of the import
Delta Import

Incremental Update

command=delta-import

Tables need an additional timestamp column
(e.g. last_modified)

Relies on dataimport.properties file, which
keeps the last indexed time
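
A delta import is typically configured on the entity itself. The fragment below is a minimal sketch that would sit inside the document element of data-config.xml; the item table and its id, name and last_modified columns are assumptions for illustration.

```xml
<!-- deltaQuery finds the primary keys of rows changed since the last import;
     deltaImportQuery then fetches each changed row by its key -->
<entity name="item" pk="id"
        query="SELECT id, name FROM item"
        deltaQuery="SELECT id FROM item
                    WHERE last_modified &gt; '${dataimporter.last_index_time}'"
        deltaImportQuery="SELECT id, name FROM item
                          WHERE id = '${dataimporter.delta.id}'">
  <field column="id" name="id"/>
  <field column="name" name="name"/>
</entity>
```

The ${dataimporter.last_index_time} variable is filled from the timestamp stored in dataimport.properties, which is why the comparison column has to be a reliable modification time. The greater-than sign is written as &amp;gt; to keep the attribute value in a consistently escaped form.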
Configuration Steps

Configure the DIH in solrconfig.xml
<requestHandler name="/dataimport"
    class="org.apache.solr.handler.dataimport.DataImportHandler">
  <lst name="defaults">
    <str name="config">/path/to/data-config.xml</str>
  </lst>
</requestHandler>

Configure the data source and entities in data-config.xml

Add the dependencies (the DIH jars and, for an RDBMS,
the JDBC driver) to the lib directory
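
Instead of copying jars by hand, the dependencies can also be loaded with lib directives in solrconfig.xml. The lines below are only a sketch: the first path follows the layout used by Solr's example configurations and may differ in your installation, and the JDBC driver directory is a hypothetical placeholder.

```xml
<!-- solrconfig.xml: load the DIH contrib jars; adjust the path to your install layout -->
<lib dir="${solr.install.dir:../../../..}/dist/" regex="solr-dataimporthandler-.*\.jar"/>
<!-- Hypothetical directory holding the JDBC driver jar for your database -->
<lib dir="/path/to/jdbc-drivers" regex=".*\.jar"/>
```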
Extending the API

Transformers

EntityProcessors

DataSource

EventListeners
Transformers

Modifies the value of a field or creates a new
field altogether

Transformers can be chained

Some built-in transformers:

RegexTransformer

DateFormatTransformer

TemplateTransformer

HTMLStripTransformer
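
Chaining is done by listing transformers, comma separated, in the entity's transformer attribute; each one acts on the fields that carry its attributes. The fragment below is a sketch with made-up column names (tag_list, created, body) and a made-up target field display_id.

```xml
<entity name="item"
        transformer="RegexTransformer,DateFormatTransformer,TemplateTransformer,HTMLStripTransformer"
        query="SELECT id, tag_list, created, body FROM item">
  <!-- RegexTransformer: split a delimited column into a multi-valued field -->
  <field column="tags" sourceColName="tag_list" splitBy=","/>
  <!-- DateFormatTransformer: parse a string column into a date -->
  <field column="created" dateTimeFormat="yyyy-MM-dd HH:mm:ss"/>
  <!-- TemplateTransformer: build a new field from existing values -->
  <field column="display_id" template="item-${item.id}"/>
  <!-- HTMLStripTransformer: remove markup from the body column -->
  <field column="body" stripHTML="true"/>
</entity>
```

The transformers are applied in the order listed, so a later one sees the output of the earlier ones.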
Custom Transformer

Write your own transformer for custom
processing before adding the row to Solr

Extend Transformer and override
transformRow()
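
Once compiled and placed on Solr's classpath, a custom transformer is referenced by its fully qualified class name. In the sketch below, com.example.TrimTransformer is a hypothetical class name used only for illustration, as are the table and columns.

```xml
<!-- A custom class can be chained with built-in transformers in the same attribute -->
<entity name="item"
        transformer="com.example.TrimTransformer,DateFormatTransformer"
        query="SELECT id, name, created FROM item">
  <field column="id" name="id"/>
  <field column="name" name="name"/>
  <field column="created" dateTimeFormat="yyyy-MM-dd HH:mm:ss"/>
</entity>
```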
Questions
Thank You!
Solr data importhandler
