Connecting to MongoDB using IBM DataStage

Introduction

MongoDB is an open-source, document-oriented, schema-less database system. It does not organize data according to the rules of the classical relational model. Unlike relational databases, where data is stored in rows and columns, MongoDB is built on collections and documents. A collection holds a set of documents, and the documents in one collection do not have to share the same fields. Data is stored as JSON-style documents. MongoDB supports dynamic queries on documents through a document-based query language, which plays the role that SQL plays in a relational database.
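
To make the collection and document idea concrete, here is a minimal sketch that uses plain Java maps in place of MongoDB documents. It is an illustration only and does not use the MongoDB driver API. The class name, the helper methods and the sample "party" data are all hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

// Illustration only: plain Java maps stand in for MongoDB documents.
// This is NOT the MongoDB driver API.
class DocumentModelSketch {

    // Build a document from alternating key/value arguments.
    static Map<String, Object> doc(Object... keyValues) {
        Map<String, Object> document = new LinkedHashMap<>();
        for (int i = 0; i + 1 < keyValues.length; i += 2) {
            document.put((String) keyValues[i], keyValues[i + 1]);
        }
        return document;
    }

    // Query by example: a document matches when it contains every
    // field of the query with an equal value.
    static boolean matches(Map<String, Object> document, Map<String, Object> query) {
        for (Map.Entry<String, Object> field : query.entrySet()) {
            if (!document.containsKey(field.getKey())) {
                return false;
            }
            if (!Objects.equals(document.get(field.getKey()), field.getValue())) {
                return false;
            }
        }
        return true;
    }

    // Return every document in the collection that matches the query.
    static List<Map<String, Object>> find(List<Map<String, Object>> collection,
                                          Map<String, Object> query) {
        List<Map<String, Object>> result = new ArrayList<>();
        for (Map<String, Object> document : collection) {
            if (matches(document, query)) {
                result.add(document);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Documents in one collection do not need the same fields.
        List<Map<String, Object>> party = new ArrayList<>();
        party.add(doc("name", "Asha", "city", "Pune"));
        party.add(doc("name", "Ben", "city", "Pune", "phone", "555-0100"));
        party.add(doc("name", "Chen", "country", "SG"));

        // The query is itself a document, not a SQL string.
        System.out.println(find(party, doc("city", "Pune")).size()); // prints 2
    }
}
```

The point of the sketch is that the query is expressed as a document and that the third record, which has no "city" field at all, is simply skipped rather than causing an error.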

Why MongoDB?

For the past two decades, relational databases have been the default data store because they were practically the only option available. With the introduction of NoSQL databases, there are now more options to choose from, depending on the requirement. MongoDB is used in industries such as insurance and travel.

MongoDB integration with IBM DataStage

IBM DataStage does not provide a dedicated stage for MongoDB, so we use the Java Integration stage to load data into MongoDB or extract data from it.
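
When loading, the core job of the Java code is to turn each incoming row into a document. The sketch below shows only that mapping step in plain Java. It deliberately leaves out the Java Integration stage API and the MongoDB driver calls, and the class name, method name and column names are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustration only: maps one row (column names plus values) to a
// JSON-style document. Reading the row from DataStage and writing
// the document to MongoDB are not shown here.
class RowToDocumentSketch {

    // Convert one row into a document. Null columns are left out,
    // because a schema-less document need not carry every field.
    static Map<String, Object> toDocument(String[] columnNames, Object[] rowValues) {
        if (columnNames.length != rowValues.length) {
            throw new IllegalArgumentException(
                "Expected " + columnNames.length + " values but got " + rowValues.length);
        }
        Map<String, Object> document = new LinkedHashMap<>();
        for (int i = 0; i < columnNames.length; i++) {
            if (rowValues[i] != null) {
                document.put(columnNames[i], rowValues[i]);
            }
        }
        return document;
    }

    public static void main(String[] args) {
        // Hypothetical columns of an input link.
        String[] columns = {"PARTY_ID", "NAME", "PHONE"};
        Object[] row = {101, "Asha", null};

        // PHONE is null in this row, so the document has two fields.
        System.out.println(toDocument(columns, row));
    }
}
```

Extraction is the reverse mapping: each document is read field by field and written to the columns of an output row, with missing fields becoming nulls.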

Prerequisites

  1. Make sure Java is installed on your machine.
  2. Install the Eclipse IDE.
  3. Import the MongoDB Java driver jar (mongo-java-driver-2.11.3.jar in this article) into the Java project so that the code can use MongoDB functions.
  4. Also import the DataStage Java Integration stage jar file into the project so that the code can extract data from or load data into DataStage.

Illustration of a DataStage job

  • Place LoadParty.jar and mongo-java-driver-2.11.3.jar at any location on the DataStage server.
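
Wherever the jars are placed, the job has to be able to find both of them. The sketch below assumes the job is given a classpath string that lists both jars, which is an assumption because the stage settings are not shown in this article. It checks that the jars exist in a directory and builds that string. The class name, method name and default directory are hypothetical.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

// Illustration only: verifies that the jars are present in one
// directory and joins their paths into a classpath string.
class JarClasspathSketch {

    static String buildClasspath(File jarDir, String separator, String... jarNames) {
        List<String> paths = new ArrayList<>();
        for (String name : jarNames) {
            File jar = new File(jarDir, name);
            if (!jar.isFile()) {
                throw new IllegalStateException("Missing jar: " + jar.getPath());
            }
            paths.add(jar.getPath());
        }
        return String.join(separator, paths);
    }

    public static void main(String[] args) {
        // Hypothetical default directory; pass the real one as the first argument.
        File jarDir = new File(args.length > 0 ? args[0] : "/opt/datastage/jars");
        try {
            System.out.println(buildClasspath(jarDir, File.pathSeparator,
                "LoadParty.jar", "mongo-java-driver-2.11.3.jar"));
        } catch (IllegalStateException e) {
            System.err.println(e.getMessage());
        }
    }
}
```

Running a check like this on the DataStage server before running the job makes a misplaced jar show up as a clear message instead of a class loading failure inside the job.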

Conclusion

Currently, DataStage has no dedicated stage for MongoDB. Extracting data from MongoDB and loading data into it would become simpler if such a stage were introduced in the future.
