Integrate Jenkins with GIT SCM

To configure Git in Jenkins, first log in to your Jenkins server. On the left side of the Dashboard there is an option "Manage Jenkins"; click on it.

Now click on "Manage Plugins" on the next screen.

On the next screen, click on the "Available" tab.

You get a long list of plugins. In the 'Filter' box, type 'Git Plugin'.

Now select the 'Git Plugin' check-box and press "Install without restart".

The installation progress screen is displayed.

Once all installations are complete, restart the Jenkins server by selecting the restart option at the bottom of the page.

Now log in to the Jenkins dashboard again.

After Jenkins is restarted, Git will be available as an option whilst configuring jobs. To verify, click on New Item in the Jenkins menu options. Then enter a name for the job; in this example the name is ‘JenkisDemo’. Select ‘Freestyle project’ as the item type and click the OK button.

On the next screen, when you click on the "Source Code Management" tab, you get Git as an option.

Jenkins Continuous Integration Tool

What is Continuous Integration?

Continuous Integration (CI) is a development practice that requires developers to integrate code into a shared repository several times a day. Each check-in is then verified by an automated build, allowing teams to detect problems early.

Many teams find that this approach leads to significantly reduced integration problems and allows a team to develop cohesive software more rapidly. This article is a quick overview of Continuous Integration summarizing the technique and its current usage. 

The main aim of CI is to prevent integration problems, referred to as "integration hell".

A continuous integration server acts as a monitor to the repository. Every time a commit against the repository finishes, the server automatically checks out the sources onto the integration machine, initiates a build, and notifies the committer of the result of the build. The committer isn't done until she gets the notification, usually an email.

What is Jenkins ? 

Jenkins is a popular open source tool to perform continuous integration and build automation. The basic functionality of Jenkins is to execute a predefined list of steps, e.g. to compile Java source code and build a JAR from the resulting classes. The trigger for this execution can be time or event based. For example, every 20 minutes or after a new commit in a Git repository.

Environment Setup for Jenkins

Jenkins can be installed through native system packages or Docker, or run standalone on any machine with the Java Runtime Environment installed.

1. In this tutorial we use the standalone distribution. To download Jenkins, click on the given Jenkins link.

2. Go to the directory where you downloaded the Jenkins war file and run the following command:

$ java -jar jenkins.war

After the command is run, various tasks will run, one of which is the extraction of the war file, which is done by an embedded web server called Winstone.

Running from: /home/expert/Downloads/jenkins.war
webroot: $user.home/.jenkins
Jul 24, 2017 10:18:48 PM Main deleteWinstoneTempContents
WARNING: Failed to delete the temporary Winstone file /tmp/winstone/jenkins.war
Jul 24, 2017 10:18:48 PM org.eclipse.jetty.util.log.JavaUtilLog info
INFO: Logging initialized @454ms
Jul 24, 2017 10:18:48 PM winstone.Logger logInternal
INFO: Beginning extraction from war file

3. Open a browser and go to http://localhost:8080/
An authentication window opens and asks you to enter the user name and password.

Enter admin as the user. The password is stored on the server machine in the file ${user.home}/.jenkins/secrets/initialAdminPassword

After successful authentication you will get the Jenkins dashboard.


Convention for Versioning of your project build

When we use any software tool, dependency or OS, it comes with a version number like 2.1.1.

As software engineers, developers and programmers, we must understand what these version numbers mean.

In this tutorial I will explain these version numbers.

The common convention for version numbers is major.minor.build

major is incremented when something major is changed in your project. For example, suppose you have removed a functionality or changed the signature of a function. If a client uses the new version, these changes can break the client's project, so clients need to take care when using a library with a different major version.

minor is incremented when something new is added to your project but all the old functionality stays the same. For example, a method is added. Clients do not need to worry about using the new version, as all the functions they are used to seeing will still be there and act the same.

build is incremented when the implementation of a function changes, but no signatures are added or removed. For example, you found a bug and fixed it. Clients should probably update to the new version, but if it doesn't work because they depended on the broken behavior, they can easily downgrade.
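
The three rules can be sketched in a few lines of Java. This is only an illustration: the class VersionBump and its methods are hypothetical names, and resetting the lower numbers to 0 when a higher one is incremented is a common practice assumed here, not something stated above.

```java
/** Hypothetical helper illustrating the major.minor.build convention. */
final class VersionBump {

    /** Returns the next version for a change of kind "major", "minor" or "build". */
    static String bump(String version, String kind) {
        String[] parts = version.split("\\.");
        if (parts.length != 3) {
            throw new IllegalArgumentException("expected major.minor.build, got " + version);
        }
        int major = Integer.parseInt(parts[0]);
        int minor = Integer.parseInt(parts[1]);
        int build = Integer.parseInt(parts[2]);
        switch (kind) {
            case "major": // breaking change; lower numbers reset to 0 (assumed practice)
                return (major + 1) + ".0.0";
            case "minor": // new functionality, old functionality unchanged
                return major + "." + (minor + 1) + ".0";
            case "build": // implementation change only, e.g. a bug fix
                return major + "." + minor + "." + (build + 1);
            default:
                throw new IllegalArgumentException("unknown change kind: " + kind);
        }
    }

    /** Clients need to take extra care only when the major number differs. */
    static boolean sameMajor(String a, String b) {
        return a.split("\\.")[0].equals(b.split("\\.")[0]);
    }

    public static void main(String[] args) {
        System.out.println(bump("2.1.1", "build")); // 2.1.2
        System.out.println(bump("2.1.1", "minor")); // 2.2.0
        System.out.println(bump("2.1.1", "major")); // 3.0.0
    }
}
```

A client on 2.1.1 could use sameMajor("2.1.1", "2.2.0") as a quick check that an upgrade should not break existing calls.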

What the Browser Does When You Type an Address in the Address Bar

As end users, we do not have to know what the browser does when we type an address into it.

But as programmers or software engineers we are sometimes involved in web programming, which includes HTTP, HTML, CSS, web servers and so on.

Most novice programmers have only an abstract idea of what is happening behind the scenes. In this tutorial I will try to give a deeper picture of what happens behind the scenes.

Suppose we are going to read a tutorial on techie-knowledge

1. We type the address (URL) we are interested in into the browser address bar.

2. The address we typed contains a domain name, but the Internet works on IP addresses, so the domain name has to be converted to an IP address.

    a. After we type the URL, the browser first extracts the domain name from the URL.
    b. The browser then queries your pre-configured DNS server to find the IP address of the domain. Sometimes the DNS server does not have the IP address for the domain; in this case it forwards the query to the DNS server it is configured to defer to.
    c. After getting the IP address, the browser sends an HTTP request to the server at that IP address.
    d. After getting the response, the browser renders the page.
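
Steps a and b can be tried out with the standard Java networking classes. This is a minimal sketch, not what a browser actually runs: the class name UrlLookup and the example URL are placeholders, and the lookup in main needs network access.

```java
import java.net.InetAddress;
import java.net.URI;
import java.net.UnknownHostException;

/** Hypothetical demo of "URL -> domain name -> IP address". */
final class UrlLookup {

    /** Step a: extract the domain name (host) from the typed URL. */
    static String hostOf(String url) {
        return URI.create(url).getHost();
    }

    /** Step b: ask the system's configured resolver for the IP address. */
    static String resolve(String host) throws UnknownHostException {
        return InetAddress.getByName(host).getHostAddress();
    }

    public static void main(String[] args) throws Exception {
        String url = "http://www.example.com/tutorials/index.html"; // placeholder URL
        String host = hostOf(url);
        System.out.println("domain name: " + host);
        System.out.println("IP address:  " + resolve(host)); // requires a working DNS
    }
}
```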

Maven archetype

A Maven archetype is a predefined template for a specific project type, such as a web project or a Spring project. In Maven, a template is called an archetype.

When we create a project in Eclipse, it asks us to select the type of project, such as a simple Java project or a dynamic web project. When we select the type, Eclipse creates the structure of the project.

Similarly, Maven provides different archetypes to create projects.

Using an Archetype:

To create a new project based on an archetype, call the archetype:generate goal, like the following:

$ mvn archetype:generate

When you run this command, Maven downloads the list of available archetypes and asks you to select the archetype for your project.

Sample output of the command:

[INFO] Scanning for projects...
[INFO] ------------------------------------------------------------------------
[INFO] Building Maven Stub Project (No POM) 1
[INFO] ------------------------------------------------------------------------
[INFO]
[INFO] >>> maven-archetype-plugin:3.0.1:generate (default-cli) > generate-sources @ standalone-pom >>>
[INFO]
[INFO] <<< maven-archetype-plugin:3.0.1:generate (default-cli) < generate-sources @ standalone-pom <<<
[INFO] --- maven-archetype-plugin:3.0.1:generate (default-cli) @ standalone-pom ---
[INFO] Generating project in Interactive mode
[INFO] No archetype defined. Using maven-archetype-quickstart (org.apache.maven.archetypes:maven-archetype-quickstart:1.0)
Choose archetype:
1: remote -> am.ik.archetype:maven-reactjs-blank-archetype (Blank Project for React.js)
2: remote -> am.ik.archetype:msgpack-rpc-jersey-blank-archetype (Blank Project for Spring Boot + Jersey)
3: remote -> am.ik.archetype:mvc-1.0-blank-archetype (MVC 1.0 Blank Project)
4: remote -> am.ik.archetype:spring-boot-blank-archetype (Blank Project for Spring Boot)
5: remote -> am.ik.archetype:spring-boot-docker-blank-archetype (Docker Blank Project for Spring Boot)
6: remote -> am.ik.archetype:spring-boot-gae-blank-archetype (GAE Blank Project for Spring Boot)
7: remote -> am.ik.archetype:spring-boot-jersey-blank-archetype (Blank Project for Spring Boot + Jersey)
8: remote -> at.chrl.archetypes:chrl-spring-sample (Archetype for Spring Vaadin Webapps)
9: remote -> (an archetype web 3.0 + struts2 (bootstrap + jquery) + JPA 2.1 with struts2 login system)
10: remote -> (An Archetype with JPA 2.1; Struts2 core; Jquery struts plugin; Struts BootStrap plugin; json Struts plugin; Login System using Session and Interceptor)
11: remote -> (Anteros Archetype for Java Web projects.)
12: remote -> (Modelos com Anotações Gson)
Choose a number or apply filter (format: [groupId:]artifactId, case sensitive contains): 1007:
Press Enter to choose the default option, or enter the number of an archetype. Enter the project details as asked. Press Enter if a default value is provided; you can override it by entering your own value.
Choose org.apache.maven.archetypes:maven-archetype-quickstart version:
1: 1.0-alpha-1
2: 1.0-alpha-2
3: 1.0-alpha-3
4: 1.0-alpha-4
5: 1.0
6: 1.1
Choose a number: 6:
Define value for property 'groupId':
Define value for property 'artifactId': mvntest
Define value for property 'version' 1.0-SNAPSHOT: :
Define value for property 'package' :

Maven will ask you to confirm the project details. Press Enter or press Y.
Confirm properties configuration:
artifactId: mvntest
version: 1.0-SNAPSHOT
package: com.esc.mvntest
Y: :
Now Maven will start creating the project structure and will display the following:
[INFO] ----------------------------------------------------------------------------
[INFO] Using following parameters for creating project from Old (1.x) Archetype: maven-archetype-quickstart:1.1
[INFO] ----------------------------------------------------------------------------
[INFO] Parameter: basedir, Value: /home/expert
[INFO] Parameter: package, Value: com.esc.mvntest
[INFO] Parameter: groupId, Value:
[INFO] Parameter: artifactId, Value: mvntest
[INFO] Parameter: packageName, Value:
[INFO] Parameter: version, Value: 1.0-SNAPSHOT
[INFO] project created from Old (1.x) Archetype in dir: /home/expert/mvntest
[INFO] ------------------------------------------------------------------------
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 06:26 min
[INFO] Finished at: 2017-07-19T18:10:36+05:30
[INFO] Final Memory: 18M/199M
[INFO] ------------------------------------------------------------------------
You'll see a Java application project named mvntest, the name given as artifactId at the time of project creation. Maven creates a standard directory layout for the project as shown below:
├── pom.xml
└── src
        ├── main
        │        └── java
        │              └── com
        │                       └── xyz
        │                               └── mvntest
        │                                              └──
        └── test
               └── java
                      └── com
                             └── xyz
                                     └── mvntest
Maven generates a pom.xml file for the project as listed below:
<project xmlns="" xmlns:xsi=""




</project>

The Java source file which was created by Maven:

package com.esc.mvntest;

/**
 * Hello world!
 */
public class App
{
    public static void main( String[] args )
    {
        System.out.println( "Hello World!" );
    }
}

Compiling code

To compile the code we use the following command:
$ mvn compile
The sample output of the above command:
[INFO] Scanning for projects...
[INFO]
[INFO] ------------------------------------------------------------------------
[INFO] Building mvntest 1.0-SNAPSHOT
[INFO] ------------------------------------------------------------------------
[INFO]
[INFO] --- maven-resources-plugin:2.6:resources (default-resources) @ mvntest ---
[INFO] Using 'UTF-8' encoding to copy filtered resources.
[INFO] skip non existing resourceDirectory /home/expert/mvntest/src/main/resources
[INFO]
[INFO] --- maven-compiler-plugin:3.2:compile (default-compile) @ mvntest ---
[INFO] Nothing to compile - all classes are up to date
[INFO] ------------------------------------------------------------------------
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 4.150 s
[INFO] Finished at: 2017-07-22T11:04:23+05:30
[INFO] Final Memory: 7M/121M
[INFO] ------------------------------------------------------------------------

Packaging the code

This project is just a simple Java program, so we can create a jar file from it. To create the jar file of the project we use the following command:

$ mvn package

The default naming convention of Maven artifacts is: {artifact-name}-{artifact-version}

Sample output of the above command:

[INFO] Scanning for projects...
[INFO]
[INFO] ------------------------------------------------------------------------
[INFO] Building mvntest 1.0-SNAPSHOT
[INFO] ------------------------------------------------------------------------
[INFO]
[INFO] --- maven-resources-plugin:2.6:resources (default-resources) @ mvntest ---
[INFO] Using 'UTF-8' encoding to copy filtered resources.
[INFO] skip non existing resourceDirectory /home/expert/mvntest/src/main/resources
[INFO]
[INFO] --- maven-compiler-plugin:3.2:compile (default-compile) @ mvntest ---
[INFO] Nothing to compile - all classes are up to date
[INFO]
[INFO] --- maven-resources-plugin:2.6:testResources (default-testResources) @ mvntest ---
[INFO] Using 'UTF-8' encoding to copy filtered resources.
[INFO] skip non existing resourceDirectory /home/expert/mvntest/src/test/resources
[INFO]
[INFO] --- maven-compiler-plugin:3.2:testCompile (default-testCompile) @ mvntest ---
[INFO] Nothing to compile - all classes are up to date
[INFO]
[INFO] --- maven-surefire-plugin:2.17:test (default-test) @ mvntest ---
[INFO] Surefire report directory: /home/expert/mvntest/target/surefire-reports
Running com.esc.mvntest.AppTest
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.004 sec - in com.esc.mvntest.AppTest
Results :
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0
[INFO] --- maven-jar-plugin:2.4:jar (default-jar) @ mvntest ---
[INFO] Building jar: /home/expert/mvntest/target/mvntest-1.0-SNAPSHOT.jar
[INFO] ------------------------------------------------------------------------
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 1.885 s
[INFO] Finished at: 2017-07-22T11:28:38+05:30
[INFO] Final Memory: 16M/158M
[INFO] ------------------------------------------------------------------------


MAVEN build phase

Maven build life cycle 

The build life cycle is divided into build phases, and the build phases are divided into build goals.
Every build follows a specified life cycle. Maven comes with a default life cycle that includes the most common build phases like compiling, testing and packaging. 

Build Life Cycles
Maven has 3 built-in build life cycles. These are:
  1. default (build) : Used to create the application
  2. clean : Cleans up artifacts that are created by prior builds
  3. site : Generates site documentation for the project
Build phase : Each build life cycle is divided into a sequence of build phases, and the build phases are again subdivided into goals. Thus, the total build process is a sequence of build life cycle(s), build phases and goals.

The following list gives an overview of the important Maven life cycle phases.
  • validate - checks if the project is correct and all information is available
  • compile - compiles source code into binary artifacts
  • test - executes the tests
  • package - takes the compiled code and packages it, for example into a JAR file
  • integration-test - takes the packaged result and executes additional tests, which require the packaging
  • verify - performs checks if the package is valid
  • install - installs the result of the package phase into the local Maven repository
  • deploy - deploys the package to a target, i.e. a remote repository
We can execute one of these build phases by passing its name to the mvn command. Here is an example:

$ mvn compile

This example executes the compile build phase, and thus also all build phases before it in Maven's predefined build phase sequence.

Note : Calling a build phase will execute not only that build phase, but also every build phase prior to the called build phase. 

Given the build phases above, when the default lifecycle is used, Maven will
  1. validate the project
  2. compile the sources
  3. run the tests against the compiled sources
  4. package the binaries (e.g. jar)
  5. run integration tests against that package
  6. verify the package
  7. install the verified package to the local repository
  8. deploy the installed package to a specified environment
To do all those, you only need to call the last build phase to be executed, in this case, deploy:

$ mvn deploy
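
The rule "calling a phase also runs every phase before it" can be modelled with a simple ordered list. This is only a sketch of the idea, not how Maven is implemented: the class LifecycleSketch is a hypothetical name, and the list holds just the eight phases from the overview above rather than the full default lifecycle.

```java
import java.util.List;

/** Hypothetical model of "a phase runs all phases before it". */
final class LifecycleSketch {

    // Simplified phase sequence taken from the overview list (not the full lifecycle).
    static final List<String> PHASES = List.of(
            "validate", "compile", "test", "package",
            "integration-test", "verify", "install", "deploy");

    /** Returns, in order, every phase that runs when the given phase is requested. */
    static List<String> phasesRunFor(String requested) {
        int index = PHASES.indexOf(requested);
        if (index < 0) {
            throw new IllegalArgumentException("unknown phase: " + requested);
        }
        return PHASES.subList(0, index + 1);
    }

    public static void main(String[] args) {
        System.out.println(phasesRunFor("compile")); // [validate, compile]
        System.out.println(phasesRunFor("package")); // [validate, compile, test, package]
    }
}
```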

The following lists all build phases of the default, clean and site lifecycles, which are executed in the order given up to the point of the one specified.

Clean Lifecycle

  • pre-clean - execute processes needed prior to the actual project cleaning
  • clean - remove all files generated by the previous build
  • post-clean - execute processes needed to finalize the project cleaning

Default Lifecycle

  • validate - validate the project is correct and all necessary information is available.
  • initialize - initialize build state, e.g. set properties or create directories.
  • generate-sources - generate any source code for inclusion in compilation.
  • process-sources - process the source code, for example to filter any values.
  • generate-resources - generate resources for inclusion in the package.
  • process-resources - copy and process the resources into the destination directory, ready for packaging.
  • compile - compile the source code of the project.
  • process-classes - post-process the generated files from compilation, for example to do bytecode enhancement on Java classes.
  • generate-test-sources - generate any test source code for inclusion in compilation.
  • process-test-sources - process the test source code, for example to filter any values.
  • generate-test-resources - create resources for testing.
  • process-test-resources - copy and process the resources into the test destination directory.
  • test-compile - compile the test source code into the test destination directory.
  • process-test-classes - post-process the generated files from test compilation, for example to do bytecode enhancement on Java classes. (Maven 2.0.5 and above)
  • test - run tests using a suitable unit testing framework. These tests should not require the code be packaged or deployed.
  • prepare-package - perform any operations necessary to prepare a package before the actual packaging. This often results in an unpacked, processed version of the package. (Maven 2.1 and above)
  • package - take the compiled code and package it in its distributable format, such as a JAR.
  • pre-integration-test - perform actions required before integration tests are executed. This may involve things such as setting up the required environment.
  • integration-test - process and deploy the package if necessary into an environment where integration tests can be run.
  • post-integration-test - perform actions required after integration tests have been executed. This may include cleaning up the environment.
  • verify - run any checks to verify the package is valid and meets quality criteria.
  • install - install the package into the local repository, for use as a dependency in other projects locally.
  • deploy - done in an integration or release environment, copies the final package to the remote repository for sharing with other developers and projects.

Site Lifecycle

  • pre-site - execute processes needed prior to the actual project site generation
  • site - generate the project's site documentation
  • post-site - execute processes needed to finalize the site generation, and to prepare for site deployment
  • site-deploy - deploy the generated site documentation to the specified web server


Maven Repository

A Maven repository is a place where we find or put jar files and plug-ins. Maven uses these repositories to locate the dependencies of the project that we define in the pom.xml file.

Maven uses three types of repository:

  • Local
  • Central
  • Remote

The order in which Maven searches for a dependency is given below:

local --> central --> remote

First it searches the local repository, then the central repository, and then the remote repository. If Maven does not find the dependency in any of these repositories, it reports an error.

Local Repository: When you install and run Maven for the first time, it creates a .m2 directory in your home directory, which contains another directory named repository.

This is the default location where Maven checks for jars. If a particular jar is not in the local repository, it is downloaded from the remote repository that was set up when Maven was installed.

Central Repository: The Maven central repository is a repository provided by the Maven community. It contains a large number of commonly used libraries.

When Maven does not find a dependency in the local repository, it starts searching in the central repository using the following URL:

Remote Repository: A developer's own custom repository containing required libraries or other project jars.
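
To make the local repository more concrete, the sketch below computes where a jar usually sits under the .m2/repository directory: the groupId becomes nested directories, followed by the artifactId, the version, and a file named after the {artifact-name}-{artifact-version} convention. The class LocalRepoPath and the coordinates in main are illustrative assumptions, not values from this tutorial.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

/** Hypothetical helper that builds the usual jar location inside a repository. */
final class LocalRepoPath {

    static Path jarPath(Path repoRoot, String groupId, String artifactId, String version) {
        return repoRoot
                .resolve(groupId.replace('.', '/'))            // dots in groupId become directories
                .resolve(artifactId)
                .resolve(version)
                .resolve(artifactId + "-" + version + ".jar"); // {artifact-name}-{artifact-version}
    }

    public static void main(String[] args) {
        // Default local repository: the .m2/repository directory in the user's home.
        Path localRepo = Paths.get(System.getProperty("user.home"), ".m2", "repository");
        // Illustrative coordinates only.
        System.out.println(jarPath(localRepo, "com.example", "mvntest", "1.0-SNAPSHOT"));
    }
}
```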

Maven POM

Pom File: 

POM stands for "Project Object Model". It is an XML file that contains information about the project you want to build and the configuration details used by Maven to build it.

For example, it defines the directory structure for your source files and class files, and also the location where the dependencies of your project are stored.

The POM also contains the goals and plugins. While executing a task or goal, Maven looks for the pom.xml in the current directory.

Some of the configuration that can be specified in the POM are following:
  • project dependencies
  • plugins
  • goals
  • build profiles
  • project version
  • developers
  • mailing list

Sample POM

There should be a single POM file for each project.

* project – root element of every POM file.

* modelVersion – sets what version of the POM model you are using.

* groupId – a unique ID for an organization or a project.

* artifactId – contains the name of the project you are building.

* version – contains the version number of the project.

The above groupId, artifactId and version elements would result in a JAR file being built and put into the local Maven repository.

Super POM:

All POMs inherit from a parent, whether explicitly defined or not. This base POM is known as the Super POM, and it contains values inherited by default.

All Maven project POMs extend the Super POM, which defines a set of defaults shared by all projects.

Effective POM:

Maven uses the effective POM (Super POM + pom.xml) to execute the relevant goal.



Defining dependencies in the POM file for your project: sample POM for defining dependencies


Maven Installation In Ubuntu

How to Install Maven?

Prerequisite: Java 1.5 or above.

Follow the given link to see how to install Java and set up the Java environment in Ubuntu.

Installing Maven on Ubuntu is pretty straightforward.

Type the commands below in a terminal.

$ sudo apt-get update
$ sudo apt-get upgrade
$ sudo apt-get install maven

The download takes a few minutes, so be patient.

Type “mvn -version” to verify the installation.
$ mvn -version
The output should look something like this:

Apache Maven 3.3.9
Maven home: /usr/share/maven
Java version: 1.8.0_131, vendor: Oracle Corporation
Java home: /opt/jdk1.8.0_131/jre
Default locale: en_IN, platform encoding: UTF-8
OS name: "linux", version: "4.4.0-83-generic", arch: "amd64", family: "unix"

Where is Maven installed?

The apt-get installation will install all the required files in the following folder structure:

  1. /usr/bin/mvn
  2. /usr/share/maven/
  3. /etc/maven   (this is the Maven configuration location)

Using Maven behind a proxy

To use Maven behind a proxy, we have to define the proxy settings in the settings.xml file, inside the proxies element.
The settings file is found at /etc/maven/settings.xml
Add the following lines under the proxies element:

What is Maven ?

Apache Maven is a software project management and comprehension tool. Based on the concept of a project object model (POM), Maven can manage a project's build, reporting and documentation from a central piece of Information.

I used Maven to compile and package my Storm topology to run on a Storm cluster in both production mode and local mode.

Maven provides developers ways to manage the following:
  • Builds
  • Documentation
  • Reporting
  • Dependencies
  • SCMs
  • Releases
  • Distribution
  • Mailing lists

Convention over Configuration

Maven uses Convention over Configuration, which means developers are not required to create the build process themselves and do not have to specify each and every configuration detail.

Maven provides sensible default behavior for projects. When a Maven project is created, Maven creates the default project structure. The developer is only required to place files accordingly and does not need to define any configuration in pom.xml.

Default structure provided by Maven for a project

Feature Summary

The following are the key features of Maven in a nutshell:

  • Simple project setup that follows best practices - get a new project or module started in seconds
  • Consistent usage across all projects - means no ramp up time for new developers coming onto a project
  • Superior dependency management including automatic updating, dependency closures (also known as transitive dependencies)
  • Able to easily work with multiple projects at the same time
  • Model based builds: Maven is able to build any number of projects into predefined output types such as a JAR, WAR, or distribution based on metadata about the project, without the need to do any scripting in most cases.
  • Coherent site of project information: Using the same metadata as for the build process, Maven is able to generate a web site or PDF including any documentation you care to add, and adds to that standard reports about the state of development of the project. Examples of this information can be seen on the Maven project's own site under the "Project Information" and "Project Reports" submenus.
  • Release management and distribution publication: Without much additional configuration, Maven will integrate with your source control system (such as Subversion or Git) and manage the release of a project based on a certain tag. It can also publish this to a distribution location for use by other projects. Maven is able to publish individual outputs such as a JAR, an archive including other dependencies and documentation, or as a source distribution.
  • Dependency management: Maven encourages the use of a central repository of JARs and other dependencies. Maven comes with a mechanism that your project's clients can use to download any JARs required for building your project from a central JAR repository much like Perl's CPAN. This allows users of Maven to reuse JARs across projects and encourages communication between projects to ensure that backward compatibility issues are dealt with.

Installing And Using GIZA++ in Ubuntu for Word Alignment

What is GIZA++ ?

 GIZA++ is an extension of the program GIZA (part of the SMT toolkit EGYPT) which was developed by the Statistical Machine Translation team during the summer workshop in 1999 at the Center for Language and Speech Processing at Johns-Hopkins University (CLSP/JHU). GIZA++ includes a lot of additional features. The extensions of GIZA++ were designed and written by Franz Josef Och.

What is a parallel corpus?

A parallel corpus is a collection of texts, each of which is translated into one or more other languages than the original.

The simplest case is where two languages only are involved: one of the corpora is an exact translation of the other. Some parallel corpora, however, exist in several languages. 

Installing GIZA++

Step 1- Download GIZA++ using the following command:

    $ wget

Step 2- Make a folder for your GIZA++ installation

    $ mkdir giza-practice

Step 3- $ mv giza-practice/

Step 4- $ cd giza-practice/

Step 5- $ unzip

Step 6- $ cd giza-pp-master/

Step 7- $ make clean

Step 8- $ make

Creating a Parallel Corpus to Use in GIZA++


GIZA++ is a tool for word alignment. It uses a parallel corpus to create a dictionary.

In this example we use two languages: English as the source language and Hindi as the target language.

Step 1. First we create a file called hindi.txt and copy the Hindi text below into this file.

मैंने उसे किताब दी .
मैंने किताब को पढ़ा .
वह किताब को प्यार करता था .
उसने किताब दी .

Step 2. Now we create a file called english.txt and copy the English text below into this file.

I gave him the book .
I read the book .
He loved the book .
He gave the book .

Now our parallel corpus is created.

Running GIZA++

Step 1. Copy the hindi.txt and english.txt files to giza-pp-master/GIZA++-v2/

Step 2. $ cd giza-pp-master/GIZA++-v2/

Step 3. Use the following command to convert your corpus into GIZA++ format:

    ./plain2snt.out [source_language_corpus] [target_language_corpus]

    $ ./plain2snt.out english.txt hindi.txt

Step 4. Type the following commands for making classes and cooccurrence:

    $ ./../mkcls-v2/mkcls -p[source_language_corpus] -V[source_language_corpus].vcb.classes

    $ ./../mkcls-v2/mkcls -p[target_language_corpus] -V[target_language_corpus].vcb.classes

Example:
    $ ./../mkcls-v2/mkcls -penglish.txt -Venglish.txt.vcb.classes
    $ ./../mkcls-v2/mkcls -phindi.txt -Vhindi.txt.vcb.classes

Step 5. Create the output directory using the command $ mkdir myout

Step 6. Now use GIZA++ to build your dictionary:

    ./GIZA++ -S [target_language_corpus].vcb -T [source_language_corpus].vcb -C [target_language_corpus]_[source_language_corpus].snt -o [prefix] -outputpath [output_folder]

Example: $ ./GIZA++ -S hindi.vcb -T english.vcb -C hindi_english.snt -outputpath myout -o test

Note: if you get an error, please update the Makefile inside GIZA++-v2



It will generate the output files in the myout/ directory.
Out of the various files, the file named with [prefix] (test in our case) will be the final file.

It contains the alignment of source and target words with their probability values:

book NULL 1
. को 0.333333
gave दी 1
He था 0.333333
him उसे 1
loved प्यार 0.5
read पढ़ा 1
the . 1
He उसने 0.333333
. किताब 0.666667
loved करता 0.5
I मैंने 1
He वह 0.333333
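
A table like this is easy to post-process. The sketch below picks the most probable target word for each source word. It assumes every line has exactly three whitespace-separated fields (source word, target word, probability), as in the listing above; the class name AlignmentTable is hypothetical and is not part of GIZA++.

```java
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical reader for "source target probability" lines. */
final class AlignmentTable {

    /** For each source word, keeps the target word with the highest probability. */
    static Map<String, String> bestTargets(List<String> lines) {
        Map<String, String> best = new LinkedHashMap<>();
        Map<String, Double> bestProb = new HashMap<>();
        for (String line : lines) {
            String[] fields = line.trim().split("\\s+");
            if (fields.length != 3) {
                continue; // skip blank or malformed lines
            }
            double prob = Double.parseDouble(fields[2]);
            // Strict comparison: on a tie the first target seen is kept.
            if (prob > bestProb.getOrDefault(fields[0], -1.0)) {
                bestProb.put(fields[0], prob);
                best.put(fields[0], fields[1]);
            }
        }
        return best;
    }

    public static void main(String[] args) throws Exception {
        if (args.length == 0) {
            System.out.println("usage: java AlignmentTable <path-to-final-file>");
            return;
        }
        List<String> lines = Files.readAllLines(Paths.get(args[0])); // read as UTF-8
        bestTargets(lines).forEach((src, tgt) -> System.out.println(src + " -> " + tgt));
    }
}
```

With only four sentence pairs, several of the alignments above are clearly wrong (for example "the" aligned with "."); a larger parallel corpus is needed for useful results.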


hashCode() and equals() Methode in java

As a Java programmer we know that java.lang.Object is the base class of every class in Java language.

Object class provide some method that provide some default implementation.

Since object class is the base class then  method defined by Object class are also available to every class defined in Java, but some time the default implementation of these method is not appropriate for new User Defined classes.

Here we discuses two most important method of object class.

The methods defined in the Object class

      public boolean equals(Object obj)

      Indicates whether some other object is "equal to" this one.

      The default implementation of the equals() method compares two objects for equality and returns true if they are equal.

This method only checks whether the two references point to the same object. That is, it compares references, not values.
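
The reference comparison described above can be seen with a small sketch. The Point class here is hypothetical and exists only for illustration; it deliberately does not override equals(), so the default from Object is used.

```java
class DefaultEqualsDemo {
    public static void main(String[] args) {
        Point a = new Point(1, 2);
        Point b = new Point(1, 2); // same field values, but a different object
        Point c = a;               // second reference to the same object

        System.out.println(a.equals(b)); // false: two different objects
        System.out.println(a.equals(c)); // true: both references point to one object
    }
}

// Hypothetical class for illustration; equals() is NOT overridden.
class Point {
    final int x;
    final int y;

    Point(int x, int y) {
        this.x = x;
        this.y = y;
    }
}
```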

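     public int hashCode()

      Returns a hash code value for the object.

      The value returned by hashCode() is the object's hash code, an int. The default implementation typically returns distinct values for distinct objects. It is often described as the object's memory address, but that is an implementation detail; the hexadecimal form is only how the default toString() prints it.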

Contract between equals() and hashCode()

 1. If two objects are equal according to equals(), their hash codes must also be equal.
 2. If you override the equals() method, you must also override the hashCode() method.

Sometimes we do not want to use the default implementation of equals() in our own class, so we must override it.

Ex. Suppose we have a class Student and we want to check whether two students are equal based on the instance variable studentId.

Then we have to override the equals() method to meet our requirement.
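
A minimal sketch of such a Student class might look like the following. Only studentId comes from the description above; the name field and the sample values are assumptions added for illustration. hashCode() is overridden together with equals() so that the contract holds and hash-based collections such as HashSet behave as expected.

```java
import java.util.HashSet;
import java.util.Set;

class StudentDemo {
    public static void main(String[] args) {
        Student s1 = new Student(101, "Asha");
        Student s2 = new Student(101, "Asha K."); // same id, different name
        Student s3 = new Student(102, "Asha");    // different id, same name

        System.out.println(s1.equals(s2)); // true: same studentId
        System.out.println(s1.equals(s3)); // false: different studentId

        Set<Student> students = new HashSet<>();
        students.add(s1);
        System.out.println(students.contains(s2)); // true: equal objects, equal hash codes
    }
}

// Hypothetical Student class; the name field is an illustrative assumption.
class Student {
    private final int studentId;
    private final String name;

    Student(int studentId, String name) {
        this.studentId = studentId;
        this.name = name;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (o == null || getClass() != o.getClass()) return false;
        Student other = (Student) o;
        return studentId == other.studentId; // equality is based on studentId only
    }

    @Override
    public int hashCode() {
        return Integer.hashCode(studentId); // uses the same field as equals()
    }
}
```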

The equals method implements an equivalence relation. It is:

Reflexive: For any non-null reference value x, x.equals(x) must return true.

Symmetric: For any non-null reference values x and y, x.equals(y) must return true if and only if y.equals(x) returns true.

Transitive: For any non-null reference values x, y, z, if x.equals(y) returns true and y.equals(z) returns true, then x.equals(z) must return true.

Consistent: For any non-null reference values x and y, multiple invocations of x.equals(y) consistently return true or consistently return false, provided no information used in equals comparisons on the objects is modified.

For any non-null reference value x, x.equals(null) must return false.

Here is an example of how to override equals() and hashCode():


public class Movie {

 String movieName;
 int price;

 public Movie(String movieName, int price) {
  this.movieName = movieName;
  this.price = price;
 }

 @Override
 public String toString() {
  return "Movie name is " + movieName + " and price is " + price;
 }

 /*
  * Two Movie objects are considered equal when their movieName is the
  * same; price is deliberately ignored.
  */
 @Override
 public boolean equals(Object o) {

  if (o == this)
   return true;
  if (o == null)
   return false;
  if (!(this.getClass().equals(o.getClass())))
   return false;
  Movie movie = (Movie) o;
  return this.movieName.equals(movie.movieName);
 }

 // Uses only movieName, the same field that equals() compares.
 @Override
 public int hashCode() {

  return 31 * movieName.hashCode();
 }
}

Test the above class


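public class Test {

 public static void main(String[] args) {
  Movie movie1 = new Movie("The Ghazi Attack", 200);
  Movie movie2 = new Movie("The Ghazi Attack", 300);

  // Logical equality through the overridden equals(): same movieName
  if (movie1.equals(movie2))
   System.out.println("objects are equal");
  else
   System.out.println("objects are not equal");

  // Assumed second check: reference comparison with ==
  if (movie1 == movie2)
   System.out.println("objects are equal");
  else
   System.out.println("objects are not equal");
 }
}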

Some Basic Points about Map, Set and List from the Java Collections Framework

A Set is a Collection that cannot contain duplicate elements.

There are three general-purpose Set implementations:

1. HashSet :

    Uses a hash table to store its elements.
    Uses a hash function for storing and retrieving its elements.
    Order is not maintained in a HashSet.

2. TreeSet :

   Uses a Red-Black tree to store its elements.
   Elements are kept sorted according to their values.

3. LinkedHashSet (LinkedList + HashSet) :

     Implemented as a hash table with a linked list running through it.
     Orders its elements based on the order in which they were inserted into the set (insertion order).
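
The ordering behaviour of the three Set implementations can be compared with a short sketch. The class name SetOrderDemo and the sample strings are illustrative assumptions; no output is shown for the HashSet order because it is not guaranteed.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

class SetOrderDemo {
    // "apple" appears twice on purpose to show that duplicates are dropped.
    static final List<String> INPUT = Arrays.asList("pear", "apple", "mango", "apple");

    public static void main(String[] args) {
        Set<String> hashSet = new HashSet<>(INPUT);
        Set<String> treeSet = new TreeSet<>(INPUT);
        Set<String> linkedHashSet = new LinkedHashSet<>(INPUT);

        System.out.println(hashSet.size());  // 3: the duplicate "apple" is stored once
        System.out.println(treeSet);         // [apple, mango, pear]: sorted by value
        System.out.println(linkedHashSet);   // [pear, apple, mango]: insertion order
    }
}
```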

A List is an ordered Collection (sometimes called a sequence). Lists may contain duplicate elements.

The Java platform contains two general-purpose List implementations:

1. ArrayList :

     Uses a variable-size array to store its elements.
     Elements can be accessed randomly using an index.
     Maintains the insertion order of its elements.

2. LinkedList :

   Doubly-linked list implementation of the List interface.
   Sequential access of elements.
   Maintains the insertion order of its elements.

Note : Deleting an element from a LinkedList is faster than from an ArrayList once the position has been reached (for example through an iterator), because no later elements need to be shifted.
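
As a rough illustration of the points above, the sketch below uses index access on an ArrayList and iterator-based removal on both list types. The class name ListDemo and the helper removeAll are hypothetical; the code shows the API usage only and does not measure performance.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.LinkedList;
import java.util.List;

class ListDemo {
    // Removes every occurrence of target while iterating. For a LinkedList each
    // removal only unlinks a node; for an ArrayList the later elements are shifted.
    static List<String> removeAll(List<String> list, String target) {
        Iterator<String> it = list.iterator();
        while (it.hasNext()) {
            if (it.next().equals(target)) {
                it.remove();
            }
        }
        return list;
    }

    public static void main(String[] args) {
        List<String> arrayList = new ArrayList<>(Arrays.asList("a", "b", "a", "c"));
        List<String> linkedList = new LinkedList<>(arrayList);

        System.out.println(arrayList.get(2));            // a: random access by index
        System.out.println(removeAll(arrayList, "a"));   // [b, c]
        System.out.println(removeAll(linkedList, "a"));  // [b, c]
    }
}
```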

A Map is an object that maps keys to values.
A map cannot contain duplicate keys: each key can map to at most one value.

The Java platform contains three general-purpose Map implementations:

1. HashMap :

   Hash table based implementation of the Map interface.
   Makes no guarantees as to the order of the map; in particular, it does not guarantee that the order will remain constant over time.

2. TreeMap :

   A Red-Black tree based NavigableMap implementation.
   The map is sorted according to the natural ordering of its keys.

3. LinkedHashMap :

   Hash table and linked list implementation of the Map interface.
   Maintains the insertion order.
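
The same kind of comparison can be sketched for the three Map implementations. The class name MapOrderDemo, the helper fill and the sample entries are assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;

class MapOrderDemo {
    // Inserts the same entries into whichever Map implementation is passed in.
    static Map<String, Integer> fill(Map<String, Integer> map) {
        map.put("pear", 3);
        map.put("apple", 1);
        map.put("mango", 2);
        map.put("pear", 30); // same key again: the old value is replaced
        return map;
    }

    public static void main(String[] args) {
        Map<String, Integer> hashMap = fill(new HashMap<>());
        Map<String, Integer> treeMap = fill(new TreeMap<>());
        Map<String, Integer> linkedHashMap = fill(new LinkedHashMap<>());

        System.out.println(hashMap.get("pear")); // 30: one value per key
        System.out.println(treeMap);             // {apple=1, mango=2, pear=30}: sorted keys
        System.out.println(linkedHashMap);       // {pear=30, apple=1, mango=2}: insertion order
    }
}
```

Re-inserting an existing key into a LinkedHashMap does not change its position, which is why "pear" stays first.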

Threads Versus Processes

Threads are a mechanism that permits an application to perform
multiple tasks concurrently. A single process can contain multiple threads.

All of these threads are independently executing the same
program, and they all share the same global memory, including the initialized data, uninitialized data, and heap segments.

The following are some of the factors that might influence our choice of whether to implement an application as a group of threads or as a group of processes.

We begin by considering the advantages of a multithreaded approach:

Sharing data between threads is easy. By contrast, sharing data between processes requires more work (e.g., creating a shared memory segment or using a pipe).

Thread creation is faster than process creation; context-switch time may be
lower for threads than for processes.

Using threads can have some disadvantages compared to using processes:

When programming with threads, we need to ensure that the functions we call are thread-safe or are called in a thread-safe manner. Multiprocess applications don’t need to be concerned with this.

A bug in one thread (e.g., modifying memory via an incorrect pointer) can damage all of the threads in the process, since they share the same address space and other attributes. By contrast, processes are more isolated from one another.

Each thread is competing for use of the finite virtual address space of the host process. In particular, each thread’s stack and thread-specific data (or thread-local storage) consumes a part of the process virtual address space, which is consequently unavailable for other threads.
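
To make the trade-off concrete, here is a minimal sketch in Java (chosen only because it is the language used elsewhere in this document). Two threads update one counter that lives in shared memory, and AtomicInteger keeps the update thread-safe. The class name SharedCounterDemo and the method countWithTwoThreads are hypothetical.

```java
import java.util.concurrent.atomic.AtomicInteger;

class SharedCounterDemo {
    // Starts two threads that increment the same counter and returns the total.
    static int countWithTwoThreads(int incrementsPerThread) {
        AtomicInteger counter = new AtomicInteger(); // shared by both threads
        Runnable work = () -> {
            for (int i = 0; i < incrementsPerThread; i++) {
                counter.incrementAndGet(); // atomic, so no update is lost
            }
        };
        Thread t1 = new Thread(work);
        Thread t2 = new Thread(work);
        t1.start();
        t2.start();
        try {
            t1.join(); // wait for both threads to finish
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return counter.get();
    }

    public static void main(String[] args) {
        System.out.println(countWithTwoThreads(100_000)); // 200000
    }
}
```

Sharing the counter needs no extra setup, which is the advantage of threads. With a plain int and counter++ instead of AtomicInteger, some increments could be lost, which is the thread-safety concern described above.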

Basics of Relational Data Model

Edgar Codd proposed the Relational Data Model in 1970.

It is a representational or implementation data model.

Using this representational (or implementation) model, we represent a database as a collection of relations.

The notion of relation here is different from the notion of relationship used in ER modeling.

A relation is the main construct for representing data in the relational model.
Every relation consists of a relation schema and a relation instance.

A relation schema is denoted by R (A1, A2, A3, ..., An), where

R  --> relation name
Ai --> attribute names

For example:

Customer (Customer ID, Tax ID, Name, Address, City, State, Zip, Phone, Email, Sex)

The number of columns in a relation is known as its degree or arity.

A relation instance or relation state (r) of R can be thought of as a table.
Each row in the table represents a collection of related data.
Each row contains facts about some entity of the same entity set.

        R = (A1, A2, A3, ..., An)
        r(R) is a set of m tuples
        r = {t1, t2, t3, ..., tm}

r is an instance of R. Each t is a tuple, that is, an ordered list of n values:
t = (v1, v2, ..., vn), where vi is an element of the domain of Ai.

Characteristics of a Relation:

Ordering of tuples is not significant.

Ordering of values in a tuple is important.

Values in a tuple under each column must be atomic (simple and single).
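
The first two characteristics can be mirrored in a small Java sketch, assuming a tuple is modelled as a List of values and a relation instance as a Set of tuples. The class name RelationDemo, the helpers tuple and relation, and the sample data are all hypothetical.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class RelationDemo {
    // A tuple is an ordered list of values.
    static List<String> tuple(String... values) {
        return Arrays.asList(values);
    }

    // A relation instance is a set of tuples.
    @SafeVarargs
    static Set<List<String>> relation(List<String>... tuples) {
        return new HashSet<>(Arrays.asList(tuples));
    }

    public static void main(String[] args) {
        Set<List<String>> r1 = relation(tuple("1", "Ravi"), tuple("2", "Meena"));
        Set<List<String>> r2 = relation(tuple("2", "Meena"), tuple("1", "Ravi"));

        // Ordering of tuples is not significant: both sets are the same relation.
        System.out.println(r1.equals(r2)); // true

        // Ordering of values in a tuple is important: these are different tuples.
        System.out.println(tuple("1", "Ravi").equals(tuple("Ravi", "1"))); // false
    }
}
```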