Question 1. What Is Gobblin?
Gobblin is a universal ingestion framework. Its goal is to pull data from any source into an arbitrary data store. One major use case for Gobblin is pulling data into Hadoop. Gobblin can pull data from file systems, SQL stores, and data that is exposed by a REST API.
Question 2. What Programming Languages Does Gobblin Support?
Gobblin presently only supports Java 6 and up.
Question 3. Does Gobblin Require Any External Software To Be Installed?
The machine that Gobblin is built on must have Java installed, and the $JAVA_HOME environment variable must be set.
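A minimal sketch of a pre-build check for this requirement. The function name check_java_home is hypothetical, and the assumption that a usable installation has an executable at $JAVA_HOME/bin/java follows the usual JDK layout rather than anything Gobblin itself documents.

```shell
#!/bin/sh
# Hypothetical helper: succeeds only when JAVA_HOME is set and
# points at a directory that contains an executable bin/java.
check_java_home() {
    if [ -z "${JAVA_HOME:-}" ]; then
        echo "JAVA_HOME is not set" >&2
        return 1
    fi
    if [ ! -x "$JAVA_HOME/bin/java" ]; then
        echo "no executable java found under $JAVA_HOME/bin" >&2
        return 1
    fi
    return 0
}

# Usage (run before building Gobblin):
#   check_java_home && ./gradlew clean build
```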
Question 4. What Hadoop Version Can Gobblin Run On?
Gobblin can run on both Hadoop 1.x and Hadoop 2.x. By default, Gobblin compiles against Hadoop 1.2.1, and can be compiled against Hadoop 2.3.0 by running ./gradlew -PuseHadoop2 clean build.
Question 5. How Do I Run And Schedule A Gobblin Job?
Check out the Deployment page for information on how to run and schedule Gobblin jobs. Check out the Configuration page for information on how to set proper configuration properties for a job.
Question 6. How Is Gobblin Different From Sqoop?
Sqoop's main focus is bulk import and export of data between relational databases and HDFS; it lacks the ETL functionality of data cleansing, data transformation, and data quality checks that Gobblin provides. Gobblin is also capable of pulling from any data source (e.g. file systems, RDBMS, REST APIs).
Question 7. When Running On Hadoop, Each Map Task Quickly Reaches 100 Percent Completion, But Then Stalls For A Long Time. Why Does This Happen?
Gobblin currently uses Hadoop map tasks as a container for running Gobblin tasks. Each map task runs 1 or more Gobblin workunits, and the progress of each workunit is not hooked into the progress of each map task. Even though the Hadoop job reports 100% completion, Gobblin is still doing work.
Question 8. Why Does Gobblin On Hadoop Stall For A Long Time Between Adding Files To The DistributedCache, And Launching The Actual Job?
Gobblin takes all WorkUnits created by the Source class and serializes each one into a file on Hadoop. These files are read by each map task, and are deserialized into Gobblin Tasks. These Tasks are then run by the map task. The reason the job stalls is that Gobblin is writing all of these files to HDFS, which can take a while, especially if there are a lot of tasks to run.
Question 9. How Do I Fix UnsupportedFileSystemException: No AbstractFileSystem For Scheme: null?
This error typically occurs due to Hadoop version conflict issues. If Gobblin is compiled against a specific Hadoop version, but then deployed on a different Hadoop version or installation, this error may be thrown. For example, if you simply compile Gobblin using ./gradlew clean build -PuseHadoop2, but deploy Gobblin to a cluster with CDH installed, you may hit this error.
It is important to realize that the gobblin-dist.tar.gz file produced by ./gradlew clean build will include all the Hadoop jar dependencies; and if one follows the MR deployment guide, Gobblin will be launched with these dependencies on the classpath.
To fix this, take the following steps:
Delete all the Hadoop jars from the Gobblin lib folder
Ensure that the environment variable HADOOP_CLASSPATH is set and points to a directory containing the Hadoop libraries for the cluster
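The two steps above can be sketched as shell helpers. This is an illustration under assumptions, not part of Gobblin: the function names are hypothetical, and the sketch assumes the bundled Hadoop jars in the lib folder are named hadoop-*.jar, which should be verified against the actual contents of the folder before deleting anything.

```shell
#!/bin/sh
# Hypothetical helper: delete jars matching hadoop-*.jar directly inside
# the given Gobblin lib directory. Assumption: bundled Hadoop jars follow
# that naming pattern; other jars are left untouched.
remove_bundled_hadoop_jars() {
    lib_dir=$1
    if [ ! -d "$lib_dir" ]; then
        echo "not a directory: $lib_dir" >&2
        return 1
    fi
    for jar in "$lib_dir"/hadoop-*.jar; do
        if [ -f "$jar" ]; then
            rm -f "$jar"
        fi
    done
    return 0
}

# Hypothetical helper: fail early when HADOOP_CLASSPATH is unset or empty.
require_hadoop_classpath() {
    if [ -z "${HADOOP_CLASSPATH:-}" ]; then
        echo "HADOOP_CLASSPATH is not set" >&2
        return 1
    fi
    return 0
}

# Usage (the lib path is a placeholder for your installation):
#   remove_bundled_hadoop_jars /path/to/gobblin-dist/lib
#   require_hadoop_classpath
```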
Question 10. How Do I Compile Gobblin Against CDH?
Cloudera Distributed Hadoop (often abbreviated as CDH) is a popular Hadoop distribution. Typically, when running Gobblin on a CDH cluster it is recommended that one also compile Gobblin against the same CDH version. Not doing so may cause unexpected runtime behavior. To compile against a specific CDH version, simply use the hadoopVersion parameter. For example, to compile against version 2.5.0-cdh5.3.0 run ./gradlew clean build -PuseHadoop2 -PhadoopVersion=2.5.0-cdh5.3.0.
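One way to keep the build version in step with the cluster is to derive it from the cluster itself. The sketch below assumes the first line of `hadoop version` output has the form "Hadoop <version>"; both function names are hypothetical, and the second function only prints the Gradle command so it can be reviewed before running.

```shell
#!/bin/sh
# Hypothetical helper: read `hadoop version` output on stdin and print the
# version string from the first line (assumed to look like "Hadoop <version>").
parse_hadoop_version() {
    awk 'NR == 1 && $1 == "Hadoop" { print $2 }'
}

# Hypothetical helper: print the Gradle build command for a Hadoop version.
gobblin_build_command() {
    printf './gradlew clean build -PuseHadoop2 -PhadoopVersion=%s\n' "$1"
}

# Usage (on a machine with the cluster's Hadoop client installed):
#   gobblin_build_command "$(hadoop version | parse_hadoop_version)"
```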
Resolve Gobblin-on-MR Exception IOException: Not all tasks running in mapper attempt_id completed successfully
This exception usually just means that a Hadoop Map Task running Gobblin Tasks threw some exception. Unfortunately, the exception is not truly indicative of the underlying problem; all it is really saying is that something went wrong in the Gobblin Task. Each Hadoop Map Task has its own log file, and it is often easiest to look at the logs of the Map Task when debugging this problem. There are multiple ways to do this, but one of the easiest is to execute yarn logs -applicationId <application ID> [OPTIONS].
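Map Task logs can be long, so it may help to narrow them down to the lines that mention an exception. The filter below is a hypothetical helper and only a rough sketch: it matches the literal text "Exception" and prints each match with its line number, so the surrounding context can then be located in the full log.

```shell
#!/bin/sh
# Hypothetical helper: read log text on stdin and print, with line numbers,
# every line that contains the text "Exception".
filter_exception_lines() {
    grep -n 'Exception'
}

# Usage (substitute the real application ID):
#   yarn logs -applicationId <application ID> | filter_exception_lines
```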
Gradle Build Fails With Cannot invoke method getURLs on null object
Add -x test to build the project without running the tests; this will make the exception go away. If one needs to run the tests, then make sure the Java Cryptography Extension is installed.
Question 11. How Do I Add A New External Dependency?
Say I want to add oozie-core-4.2.0.jar as a dependency to the gobblin-scheduler subproject. I would first open the file build.gradle and add the following entry to the ext.externalDependency array: "oozieCore": "org.apache.oozie:oozie-core:4.2.0".
Then in the gobblin-scheduler/build.gradle file I would add the following line to the dependency block: compile externalDependency.oozieCore.
Question 12. How Do I Add A New Maven Repository To Pull Artifacts From?
Oftentimes, one may have important artifacts stored in a local or private Maven repository. As of 01/21/2016 Gobblin only pulls artifacts from the following Maven repositories: Maven Central, Conjars, and Cloudera.
In order to add another Maven repository, modify the defaultEnvironment.gradle file and add the new repository using the same pattern as the existing ones.