Sometimes it does matter what people think of you!

Somewhere I read: “What others think of you is none of your business!” True, agreed. But sometimes it does matter.

In my opinion, it is a very good way of looking at yourself through others’ eyes. It can be a painful experience, or it can be a feel-good one.

Today was my last day at the organization where I have been working. It has hardly been a year since I joined, but I enjoyed it a lot. I was not aware that, without knowing it, I had made so many friends and formed emotional bonds with so many colleagues. I realized it when I sent out a goodbye mail and received so many overwhelming responses. It really made me feel so good…


Telepathy… yes, it exists!

Sometimes we have the experience of thinking about a person, and then that person suddenly visits or rings us. Yes, it happens… and it is not coincidence but something called telepathy.

Telepathy (from the Ancient Greek τῆλε, tele meaning “distant” and πάθος, pathos or -patheia meaning “feeling, perception, passion, affliction, experience”) is the purported transmission of information from one person to another without using any of our known sensory channels or physical interaction. (Source: Wikipedia)

I have experienced it many times. There is a friend of mine, a well-wisher… whatever you want to call her. Whenever I think about her, I get a phone call from her. This has happened many times. I don’t know what our bond is, but whenever I strongly feel like talking to her, or I simply remember her, there is always a phone call from her.

There are other experiences as well, like thinking about someone and then crossing paths with that very person. This too has happened many times, with people for whom I have strong bonds and feelings.

It also makes me think about, and believe in, the concept of “Oneness”.

Spring Batch – Code example

In the last article on Spring Batch, we went through the concepts and interfaces provided by the Spring Batch framework.

In this article let’s see how we can use those concepts to create a batch job.

For simplicity, here is the simple batch job we will take as an example:

  • Read from a CSV file.
  • Process the data – select only records where age > 30.
  • Write to another CSV file.

  1. Configuration file snippet for this job:

<batch:job id="reportJob">
    <batch:listeners>
        <batch:listener ref="customJobListener" />
    </batch:listeners>
    <batch:step id="step1">
        <tasklet>
            <chunk reader="csvFileItemReader" writer="cvsFileItemWriter"
                   processor="filterCSVProcessor" commit-interval="1">
                <listeners>
                    <listener ref="customStepListener" />
                    <listener ref="customItemReaderListener" />
                    <listener ref="customItemWriterListener" />
                    <listener ref="customItemProcessListener" />
                </listeners>
            </chunk>
        </tasklet>
    </batch:step>
</batch:job>

In this configuration file, you first define a job with a job ID. Then you define any listeners, which provide callbacks at specific points in the lifecycle of a job, such as before or after the job starts.

Then you define a step, which also has a unique ID. Inside the step you define the work required to perform your job in terms of an item reader, processor, and writer, which together are configured as one unit with the chunk element.

  • reader – The ItemReader that provides the items for processing.
  • processor – The ItemProcessor that transforms or filters each item (optional).
  • writer – The ItemWriter that writes out the items provided by the reader and processor.
  • commit-interval – The number of items that will be processed before the transaction is committed.

Then you define listeners for the step, which again are callbacks at specific points in the lifecycle of a step, such as before or after the step starts, or before and after reading an item.

The chunk element is defined within the tasklet tag.

Below is a code snippet for the item reader listener:

import org.springframework.batch.core.ItemReadListener;

public class CustomItemReaderListener implements ItemReadListener<User> {

    @Override
    public void beforeRead() {
        System.out.println("CustomItemReaderListener : beforeRead()");
    }

    @Override
    public void afterRead(User item) {
        System.out.println("CustomItemReaderListener : afterRead()");
    }

    @Override
    public void onReadError(Exception ex) {
        System.out.println("CustomItemReaderListener : onReadError()");
    }
}

Similarly, you define the other listeners for the job, step, reader, writer, and processor, which are implementations of the respective listener interfaces provided by the Spring Batch framework.
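
For example, the customJobListener referenced in the job configuration above could be a JobExecutionListener implementation along these lines (its actual code is not shown in this article, so treat this as a sketch):

import org.springframework.batch.core.JobExecution;
import org.springframework.batch.core.JobExecutionListener;

// A minimal sketch of what customJobListener might look like;
// the real implementation may do more than log.
public class CustomJobListener implements JobExecutionListener {

    @Override
    public void beforeJob(JobExecution jobExecution) {
        System.out.println("CustomJobListener : beforeJob()");
    }

    @Override
    public void afterJob(JobExecution jobExecution) {
        System.out.println("CustomJobListener : afterJob()");
    }
}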


  2. Then we need to launch this job. Here is a code snippet:

// Load the Spring context from both configuration files.
String[] springConfig = {
        "spring/batch/config/context.xml",
        "spring/batch/jobs/job-report.xml"
};

ApplicationContext context = new ClassPathXmlApplicationContext(springConfig);

JobLauncher jobLauncher = (JobLauncher) context.getBean("jobLauncher");
Job job = (Job) context.getBean("reportJob");

try {
    // Launch the job; the returned JobExecution carries its status.
    JobExecution execution = jobLauncher.run(job, new JobParameters());
    System.out.println("Job Exit Status : " + execution.getStatus());
} catch (Exception e) {
    e.printStackTrace();
}

System.out.println("Done with batch");

First, you create a JobLauncher instance from the “jobLauncher” bean defined in context.xml.

Then you create a Job instance from the “reportJob” bean defined in the job configuration file.

When you run your Job instance with the help of the JobLauncher, you get back a JobExecution instance, which tells you whether your job executed successfully or not.


  3. Now let’s see how the reader and writer are configured in our job-report.xml.

<bean id="csvFileItemReader" class="org.springframework.batch.item.file.FlatFileItemReader">
    <!-- Read a csv file -->
    <property name="resource" value="file:csv/input/read.csv" />
    <property name="lineMapper">
        <bean class="org.springframework.batch.item.file.mapping.DefaultLineMapper">
            <!-- split it -->
            <property name="lineTokenizer">
                <bean class="org.springframework.batch.item.file.transform.DelimitedLineTokenizer">
                    <property name="names" value="name,age,phone" />
                </bean>
            </property>
            <!-- map to an object -->
            <property name="fieldSetMapper">
                <bean class="org.springframework.batch.item.file.mapping.BeanWrapperFieldSetMapper">
                    <property name="prototypeBeanName" value="user" />
                </bean>
            </property>
        </bean>
    </property>
</bean>

<bean id="cvsFileItemWriter" class="org.springframework.batch.item.file.FlatFileItemWriter">
    <!-- write to this csv file -->
    <property name="resource" value="file:csv/output/write.csv" />
    <property name="shouldDeleteIfExists" value="true" />
    <property name="lineAggregator">
        <bean class="org.springframework.batch.item.file.transform.DelimitedLineAggregator">
            <property name="delimiter" value="," />
            <property name="fieldExtractor">
                <bean class="org.springframework.batch.item.file.transform.BeanWrapperFieldExtractor">
                    <property name="names" value="name,age,phone" />
                </bean>
            </property>
        </bean>
    </property>
</bean>

You first define the beans responsible for reading and writing. In this example we use the reader and writer provided by the Spring Batch framework, i.e. FlatFileItemReader and FlatFileItemWriter, since we are dealing with comma-separated records. You can plug in your own custom classes as well.

Then you specify the CSV file to be read or written using the resource property.

  4. Reading:

We have defined a line mapper. Line mappers are used for tokenizing each line into a FieldSet and then mapping it to an item, in our case the domain class User.
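
The User domain class itself is not shown in the article, but from the tokenizer’s column names (name, age, phone) it would be a plain bean roughly like this:

// A sketch of the User domain class implied by the configuration;
// the actual class in the source code may differ.
public class User {

    private String name;
    private int age;
    private String phone;

    // BeanWrapperFieldSetMapper populates the bean via these setters,
    // and BeanWrapperFieldExtractor later reads it via the getters.
    public String getName() { return name; }
    public void setName(String name) { this.name = name; }

    public int getAge() { return age; }
    public void setAge(int age) { this.age = age; }

    public String getPhone() { return phone; }
    public void setPhone(String phone) { this.phone = phone; }
}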

  5. Writing:

We have defined a line aggregator that converts an object into a delimited list of strings. The default delimiter is a comma.

Then we need to extract the fields from our domain object User. For this we use a field extractor which, given an array of property names, reflectively calls the getters on the item and returns an array of all the values.
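
Concretely, for each User item the configured BeanWrapperFieldExtractor behaves roughly like this (an illustration, not the framework’s actual code):

// Illustration: extracting "name,age,phone" from a User item.
User user = new User();
Object[] fields = new Object[] { user.getName(), user.getAge(), user.getPhone() };
// The DelimitedLineAggregator then joins these values with commas.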

  6. Processing:

We have a custom item processor because we want to keep only the records where age > 30.

For this we defined a bean:

<bean id="filterCSVProcessor" class="com.springbatch.processor.FilterCSVProcessor" />

And here is the implementation:

import org.springframework.batch.item.ItemProcessor;

public class FilterCSVProcessor implements ItemProcessor<User, User> {

    @Override
    public User process(User user) throws Exception {
        // Returning null tells Spring Batch to filter the record out,
        // so only users older than 30 reach the writer.
        if (user.getAge() > 30) {
            return user;
        }
        return null;
    }
}

That’s all. The complete source code can be found here.


Spring Batch – Concepts and interfaces

Recently I came across a very interesting incident related to Spring Batch, where a write skip count was not getting updated properly. I had never worked with Spring Batch and knew almost nothing about it before picking up this incident.

So, after understanding what the incident was about, I started reading up on how Spring Batch works.

Let’s see what Spring Batch is and how it works.

In general terms, a “batch” is the execution of a series of programs on a computer without manual intervention.

Where batch processing can be used:

  • Data Export
  • Invoice generation
  • Bulk database updates
  • Automated transaction processing
  • Processing digital images

What is Spring Batch: Spring Batch is an open source framework for batch processing. It is a lightweight framework built on top of the Spring framework.

Features:

  • Transaction management
  • Chunk based processing
  • Start/Stop/Restart
  • Retry/Skip

Let’s first understand the terms that are core to the Spring Batch framework.

Batch: The execution of a series of jobs.

Job: A sequence of one or more steps and the associated configuration that belong to the batch job. A job is intended to be executed without interruption.

JobInstance: A uniquely identifiable logical run of a job.

JobExecution: A single attempt to run a job. A JobInstance is considered complete only when a JobExecution completes successfully. For example, if a nightly run of a job fails and is restarted, both attempts are JobExecutions belonging to the same JobInstance.

Step: A Step is a part of a Job and contains all the information necessary to execute the batch processing actions expected at that phase of the job. A Step is a single state within the flow of a job.

StepExecution: An attempt to execute a step. It contains information such as the commit count and provides access to the ExecutionContext.

JobRepository: The job repository provides CRUD persistence operations for all job-related metadata: the results obtained, the job instances, the parameters used for the jobs executed, and the context where the processing runs.

JobLauncher: Responsible for launching jobs with their job parameters.
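
The JobLauncher interface ties several of these terms together. Simplified, it looks roughly like this (the real declaration lists specific batch exceptions rather than a plain Exception):

import org.springframework.batch.core.Job;
import org.springframework.batch.core.JobExecution;
import org.springframework.batch.core.JobParameters;

// Simplified view of Spring Batch's JobLauncher interface.
public interface JobLauncher {

    // Running a Job with a given set of JobParameters identifies a
    // JobInstance and returns the resulting JobExecution.
    JobExecution run(Job job, JobParameters jobParameters) throws Exception;
}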

A batch application can be divided into three main parts:

  • Reading the data (from a database, file system, etc.)
  • Processing the data (filtering, grouping, calculating, validating…)
  • Writing the data (to a database, reporting, distributing…)

There are various readers and writers provided by the Spring Batch framework.

Key interfaces are (simplified signatures are shown below the list):

  • ItemReader
  • ItemWriter
  • ItemProcessor
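
For reference, here is roughly what these three interfaces look like (simplified; in the framework each lives in its own file under org.springframework.batch.item, and read() declares several specific batch exceptions):

import java.util.List;

// Simplified signatures of the three key interfaces.
public interface ItemReader<T> {
    // Returns the next item, or null once the input is exhausted.
    T read() throws Exception;
}

public interface ItemProcessor<I, O> {
    // Transforms an item; returning null filters it out.
    O process(I item) throws Exception;
}

public interface ItemWriter<T> {
    // Writes a whole chunk of items in one call.
    void write(List<? extends T> items) throws Exception;
}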

ItemReader: Readers are abstractions responsible for data retrieval. Here is a list of readers:

  • AmqpItemReader
  • AggregateItemReader
  • FlatFileItemReader
  • HibernateCursorItemReader
  • HibernatePagingItemReader
  • IbatisPagingItemReader
  • ItemReaderAdapter
  • JdbcCursorItemReader
  • JdbcPagingItemReader
  • JmsItemReader
  • JpaPagingItemReader
  • ListItemReader
  • MongoItemReader
  • Neo4jItemReader
  • RepositoryItemReader
  • StoredProcedureItemReader
  • StaxEventItemReader

ItemWriter: Writers are abstractions responsible for writing the data to the desired output database or system. Here is a list of writers:

  • AbstractItemStreamItemWriter
  • AmqpItemWriter
  • CompositeItemWriter
  • FlatFileItemWriter
  • GemfireItemWriter
  • HibernateItemWriter
  • IbatisBatchItemWriter
  • ItemWriterAdapter
  • JdbcBatchItemWriter
  • JmsItemWriter
  • JpaItemWriter
  • MimeMessageItemWriter
  • MongoItemWriter
  • Neo4jItemWriter
  • StaxEventItemWriter
  • RepositoryItemWriter

ItemProcessor: Processors are responsible for modifying the data records, converting them from the input format to the desired output format. They are optional. Here is a list:

  • ValidatingItemProcessor
  • PassThroughItemProcessor
  • ScriptItemProcessor

And many others.

To put it all together, this is how it looks:

[Figure: SpringBatch]


OK. So now, how are these batch jobs processed? There are two ways.

1. Chunk-oriented processing:

Chunk-oriented processing refers to reading the data one item at a time and creating “chunks” that are written out within a transaction boundary. One item is read in from an ItemReader, handed to an ItemProcessor, and aggregated. Once the number of items read equals the commit interval, the entire chunk is written out via the ItemWriter, and the transaction is then committed.

[Figure: chunk-oriented-processing]
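
In simplified code, one chunk plays out roughly like this (a sketch of the description above, not the framework’s actual implementation):

import java.util.ArrayList;
import java.util.List;

import org.springframework.batch.item.ItemProcessor;
import org.springframework.batch.item.ItemReader;
import org.springframework.batch.item.ItemWriter;

// A sketch of one transactional chunk.
public class ChunkSketch {

    static <I, O> void processOneChunk(ItemReader<I> reader,
                                       ItemProcessor<I, O> processor,
                                       ItemWriter<O> writer,
                                       int commitInterval) throws Exception {
        List<O> items = new ArrayList<>();
        for (int i = 0; i < commitInterval; i++) {
            I item = reader.read();               // read one item at a time
            if (item == null) {
                break;                            // input exhausted
            }
            O processed = processor.process(item);
            if (processed != null) {              // null means "filtered out"
                items.add(processed);
            }
        }
        writer.write(items);                      // write the whole chunk
        // ...in the real framework the surrounding transaction commits here
    }
}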

Configuring a step for chunk-oriented processing:

<job id="sampleJob" job-repository="jobRepository">
    <step id="step1">
        <tasklet transaction-manager="transactionManager">
            <chunk reader="itemReader" writer="itemWriter" commit-interval="10"/>
        </tasklet>
    </step>
</job>


2. Tasklet-oriented processing: Sometimes a step consists of a single simple operation, like a stored procedure call or deleting a file.

For such cases, the Tasklet interface is provided.

Tasklet is a simple interface with one method, execute, which is called repeatedly by the TaskletStep until it either returns RepeatStatus.FINISHED or throws an exception to signal a failure.

Configuration of a step as a tasklet:

<step id="step1">
    <tasklet ref="myTasklet"/>
</step>
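
The myTasklet bean referenced above could be implemented along these lines (a sketch assuming the single task is deleting a file; the class name and path are hypothetical):

import java.io.File;

import org.springframework.batch.core.StepContribution;
import org.springframework.batch.core.scope.context.ChunkContext;
import org.springframework.batch.core.step.tasklet.Tasklet;
import org.springframework.batch.repeat.RepeatStatus;

// Hypothetical tasklet that performs one simple task: deleting a file.
public class FileDeletingTasklet implements Tasklet {

    private final File file = new File("csv/output/write.csv"); // assumed path

    @Override
    public RepeatStatus execute(StepContribution contribution,
                                ChunkContext chunkContext) throws Exception {
        file.delete();
        // FINISHED tells the TaskletStep not to call execute() again.
        return RepeatStatus.FINISHED;
    }
}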


In the next article, we will see how these concepts are used to create a batch job application.