Java Guild · 2026
A Practical Introduction for Developers
Part 1
You could write a loop. But then you'd also need to build:
- transaction management and chunked commits
- run tracking, so a failed job can be restarted
- skip and retry handling for bad records
- per-step metrics: read / write / skip counts

Spring Batch gives you all of this out of the box.
Part 2
Instead of read-all then write-all, data flows in bounded chunks — each chunk is one database transaction.
- **ItemReader** — returns one item per call; returns `null` when the source is exhausted, so no loop is needed.
- **ItemProcessor** — transforms or validates; return `null` to skip an item. Entirely optional.
- **ItemWriter** — receives the whole `Chunk<O>`; writes in bulk for efficiency and atomicity.
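The contract is easiest to see as code. Below is a minimal, runnable sketch of the loop Spring Batch drives for you — `SimpleReader`, `SimpleProcessor`, and `SimpleWriter` are hypothetical stand-ins invented for this sketch, not the real `org.springframework.batch.item` interfaces:

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical, simplified stand-ins for ItemReader, ItemProcessor, ItemWriter.
interface SimpleReader<T> { T read(); }                    // null = source exhausted
interface SimpleProcessor<I, O> { O process(I item); }     // null = skip this item
interface SimpleWriter<T> { void write(List<T> chunk); }   // receives the whole chunk

public class ChunkLoopSketch {

    // The loop the framework runs for you: fill a chunk item by item,
    // then hand the whole chunk to the writer in one go.
    static <I, O> void runChunks(SimpleReader<I> reader, SimpleProcessor<I, O> processor,
                                 SimpleWriter<O> writer, int chunkSize) {
        boolean exhausted = false;
        while (!exhausted) {
            List<O> chunk = new ArrayList<>();
            while (chunk.size() < chunkSize) {
                I item = reader.read();
                if (item == null) { exhausted = true; break; }  // source drained
                O out = processor.process(item);
                if (out != null) chunk.add(out);                // null = skipped
            }
            if (!chunk.isEmpty()) writer.write(chunk);          // one chunk = one transaction
        }
    }

    static List<List<String>> demo() {
        Iterator<String> source = List.of("a", "b", "skip", "c").iterator();
        List<List<String>> written = new ArrayList<>();
        SimpleReader<String> reader = () -> source.hasNext() ? source.next() : null;
        SimpleProcessor<String, String> processor = s -> s.equals("skip") ? null : s.toUpperCase();
        SimpleWriter<String> writer = written::add;
        runChunks(reader, processor, writer, 2);
        return written;
    }

    public static void main(String[] args) {
        System.out.println(demo()); // [[A, B], [C]]
    }
}
```

Note how the skipped item simply never reaches a chunk, and the final partial chunk is still written.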
Every run is automatically stored in a JobRepository — a set of tables in your database.
| Table | Stores |
|---|---|
| `BATCH_JOB_INSTANCE` | Unique combination of job name + parameters |
| `BATCH_JOB_EXECUTION` | Each run: start time, end time, final status |
| `BATCH_STEP_EXECUTION` | Per-step metrics: read / write / skip counts |
| `BATCH_JOB_EXECUTION_PARAMS` | Parameters passed to the job |
Part 3 · Demo 1
CSV → uppercase names → CSV

input.csv:
firstName,lastName
Jill,Doe
Joe,Doe
Justin,Doe
Jane,Doe
John,Doe
basics-output.csv:
firstName,lastName
JILL,DOE
JOE,DOE
JUSTIN,DOE
JANE,DOE
JOHN,DOE
basics/config/BasicsJobConfig.java
@Bean
public FlatFileItemReader<Person> basicsReader() {
    return new FlatFileItemReaderBuilder<Person>()
            .name("personItemReader")
            .resource(new ClassPathResource("basics/input.csv"))
            .delimited()
            .names("firstName", "lastName")  // CSV column names
            .targetType(Person.class)        // maps to POJO via reflection
            .build();
}
The reader hands back one Person per call and returns null when the file is exhausted.
Spring Batch drives the loop — you never write it yourself.
basics/processing/PersonProcessor.java
public class PersonProcessor implements ItemProcessor<Person, Person> {
    @Override
    public Person process(Person person) {
        String firstName = person.firstName().toUpperCase();
        String lastName = person.lastName().toUpperCase();
        Person transformed = new Person(firstName, lastName);
        log.info("Converting {} to {}", person, transformed);
        return transformed; // return null here to SKIP this item entirely
    }
}
ItemProcessor<Input, Output>. Input and output types can differ — useful when translating between two domain models.
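A tiny self-contained illustration of a type-changing processor, using a plain functional interface in place of ItemProcessor — `Transformer` and `DisplayName` are names invented for this sketch:

```java
// A plain stand-in for ItemProcessor<I, O>; Transformer and DisplayName are
// hypothetical names, invented for this sketch.
interface Transformer<I, O> { O process(I item); }

record Person(String firstName, String lastName) {}
record DisplayName(String value) {}   // a second, different output model

public class TypeChangingProcessor {

    // Input type (Person) and output type (DisplayName) differ —
    // exactly what ItemProcessor<I, O> permits.
    static final Transformer<Person, DisplayName> toDisplayName =
            p -> new DisplayName(p.firstName() + " " + p.lastName().toUpperCase());

    static String demo() {
        return toDisplayName.process(new Person("Jill", "Doe")).value();
    }

    public static void main(String[] args) {
        System.out.println(demo()); // Jill DOE
    }
}
```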
@Bean
public Step basicsStep(JobRepository repo, PlatformTransactionManager tx,
                       FlatFileItemReader<Person> reader,
                       PersonProcessor processor,
                       FlatFileItemWriter<Person> writer) {
    return new StepBuilder("basicsStep", repo)
            .<Person, Person>chunk(10, tx) // chunk size = 10 items per transaction
            .reader(reader)
            .processor(processor)
            .writer(writer)
            .build();
}

@Bean
public Job basicsJob(JobRepository repo, Step basicsStep) {
    return new JobBuilder("basicsJob", repo)
            .start(basicsStep)
            .build();
}
Everything assembled with builders. Every piece is a Spring bean — testable, injectable, familiar.
Part 4 · Demo 2
A global ecommerce sales pipeline · 41 markets · nightly batch job
A global ecommerce platform sells across 41 countries. Every night the ordering system dumps a raw CSV export of the day's transactions. The data is messy — mixed case, missing emails, bad amounts. The sales team needs a clean per-country revenue breakdown ready in their dashboard by 08:00.
A nightly Spring Batch job runs at 02:00, validates and normalises every order record, loads the clean data into a reporting database, then aggregates it into a per-country summary CSV that the dashboard reads on startup.
Step 1 — validate & load raw orders
Step 2 — aggregate for the dashboard
The nightly CSV export is generated by multiple regional order systems — they don't agree on formatting conventions.
id,firstName,lastName,email,country,purchaseAmount
1,alex,johnson,alex.j@email.com,south africa,523.45 ← lowercase name & country
2,MARIA,SMITH,,united states,-15.00 ← guest checkout (no email) + bad amount
3,Wei,Chen,wei@test.com,china,341.20 ← clean record
Regional systems export names and countries in different formats. Normalised to "Alex" and "SOUTH AFRICA" for consistent grouping.
~5% of orders come from guests with no account — no email on file. These can't be attributed to a customer, so they're excluded from the report.
Some regional systems write refund entries as negative purchase amounts instead of separate records. These are rejected to avoid skewing revenue figures.
@Override
public Customer process(Customer c) {
    // FILTER — return null to skip this record silently
    if (c.email() == null || c.email().isBlank()) return null;
    if (c.purchaseAmount() == null || c.purchaseAmount() < 0) return null;

    // TRANSFORM
    return new Customer(
            c.id(),
            capitalize(c.firstName()),   // "ALEX" → "Alex"
            capitalize(c.lastName()),
            c.email().toLowerCase(),
            c.country().toUpperCase(),   // "south africa" → "SOUTH AFRICA"
            c.purchaseAmount()
    );
}
null is the idiomatic Spring Batch way to skip a record — no exception, no special configuration, just return null.
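To see the filter working without any Spring machinery, here's a self-contained sketch applying the same two rules to a small list — the trimmed-down `Order` record and the sample values are invented for illustration:

```java
import java.util.List;
import java.util.Objects;

public class NullSkipDemo {

    record Order(String email, Double amount) {}   // hypothetical, trimmed-down record

    // Same rules as the processor above: missing email or negative amount → null (skip).
    static Order process(Order o) {
        if (o.email() == null || o.email().isBlank()) return null;
        if (o.amount() == null || o.amount() < 0) return null;
        return new Order(o.email().toLowerCase(), o.amount());
    }

    static List<Order> processAll(List<Order> in) {
        return in.stream().map(NullSkipDemo::process).filter(Objects::nonNull).toList();
    }

    static int demo() {
        List<Order> in = List.of(
                new Order("ALEX.J@EMAIL.COM", 523.45),   // kept (email lowercased)
                new Order("", -15.00),                   // guest checkout + refund → skipped
                new Order("wei@test.com", 341.20));      // kept
        return processAll(in).size();
    }

    public static void main(String[] args) {
        System.out.println(demo()); // 2
    }
}
```

In the real job, the `filter(Objects::nonNull)` part is handled by the framework: a null from the processor simply never reaches the writer.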
@Bean
public JdbcBatchItemWriter<Customer> customerWriter(DataSource ds) {
    return new JdbcBatchItemWriterBuilder<Customer>()
            .dataSource(ds)
            .sql("""
                INSERT INTO customers
                    (id, first_name, last_name, email, country, purchase_amount)
                VALUES
                    (:id, :firstName, :lastName, :email, :country, :purchaseAmount)
                """)
            .beanMapped() // maps Java record fields to named SQL parameters
            .build();
}
Step 2 needs to aggregate rows by country. No built-in reader does this — so we implement ItemReader ourselves.
public class CountryStatisticsReader implements ItemReader<CountryStatistics> {

    private Iterator<CountryStatistics> iterator;

    @Override
    public CountryStatistics read() {
        if (iterator == null) {
            // Called once on the first read — load and aggregate everything
            List<Customer> all = jdbc.query("SELECT * FROM customers", ...);
            Map<String, CountryStatistics> map = new HashMap<>();
            for (Customer c : all)
                map.merge(c.country(), new CountryStatistics(c), CountryStatistics::merge);
            iterator = map.values().iterator();
        }
        return iterator.hasNext() ? iterator.next() : null; // null = done
    }
}
```
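The `Map.merge` line does the heavy lifting: insert on first sight of a country, combine otherwise. A runnable sketch of the same pattern — the `Stats` record and its `merge` method are assumptions standing in for CountryStatistics, whose merge implementation isn't shown in the deck:

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class MergeAggregationSketch {

    record Customer(String country, double purchaseAmount) {}

    // A plausible shape for CountryStatistics + CountryStatistics::merge (assumed):
    record Stats(long count, double total) {
        static Stats of(Customer c) { return new Stats(1, c.purchaseAmount()); }
        static Stats merge(Stats a, Stats b) {
            return new Stats(a.count() + b.count(), a.total() + b.total());
        }
    }

    static Map<String, Stats> aggregate(List<Customer> all) {
        Map<String, Stats> map = new HashMap<>();
        for (Customer c : all)
            // absent key → insert Stats.of(c); present key → combine via Stats::merge
            map.merge(c.country(), Stats.of(c), Stats::merge);
        return map;
    }

    public static void main(String[] args) {
        Map<String, Stats> stats = aggregate(List.of(
                new Customer("CANADA", 100.0),
                new Customer("CANADA", 50.0),
                new Customer("TURKEY", 75.0)));
        System.out.println(stats.get("CANADA")); // Stats[count=2, total=150.0]
    }
}
```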
country-statistics.csv:
country,customerCount,totalRevenue,avgPurchase
CANADA,227,116600.54,513.66
VIETNAM,243,122213.53,502.94
TURKEY,212,99352.62,468.64
GERMANY,198,95432.11,481.98
...
ICELAND,1,850.75,850.75
(41 countries total)
public record CountryStatistics(
        String country,
        long customerCount,
        double totalRevenue,
        double averagePurchaseAmount
) {}
CountryStatisticsProcessor rounds monetary values to 2 decimal places before the writer flushes.
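The deck doesn't show the rounding code itself; one common way to do it, assumed here, is via BigDecimal with HALF_UP:

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

public class RoundingSketch {

    // Assumed implementation of the processor's rounding step —
    // BigDecimal avoids the precision surprises of rounding doubles by hand.
    static double round2(double value) {
        return BigDecimal.valueOf(value).setScale(2, RoundingMode.HALF_UP).doubleValue();
    }

    public static void main(String[] args) {
        System.out.println(round2(116600.5432)); // 116600.54
        System.out.println(round2(513.6612));    // 513.66
    }
}
```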
Part 5
JobExecutionListener
void beforeJob(JobExecution e);
void afterJob(JobExecution e);
Logs job name, start/end time, final status.

StepExecutionListener
void beforeStep(StepExecution e);
ExitStatus afterStep(StepExecution e);
Read count, write count, skip count per step.

ChunkListener
void beforeChunk(ChunkContext c);
void afterChunk(ChunkContext c);
void afterChunkError(ChunkContext c);
Running totals across each chunk.
new JobBuilder("customerJob", repo)
        .listener(jobListener)      // JobExecutionListener registers on the job
        ...

new StepBuilder("processStep", repo)
        .listener(stepListener)     // step- and chunk-level listeners register on the step
        .listener(chunkListener)
        ...
| Strategy | What it does | Use when |
|---|---|---|
| `return null` from the processor | Silently skips the item — used in this demo for missing emails and negative amounts. | Bad data is expected and should be excluded. |
| `.skip(Ex.class).skipLimit(N)` | Skips items that throw a specific exception, up to N times total. | Occasional bad records in otherwise good data. |
| `.retry(Ex.class).retryLimit(N)` | Retries the chunk when a transient exception occurs. | Flaky external services, transient DB errors. |
| Chunk rollback | Automatic — a failed write rolls back only the current chunk. | Always on, no configuration needed. |
@SpringBatchTest
@SpringBootTest
class BatchIntegrationTest {

    @Autowired JobLauncherTestUtils utils;

    @Test
    void customerJob_completesSuccessfully() throws Exception {
        JobExecution exec = utils.launchJob();
        assertEquals(COMPLETED, exec.getStatus());

        StepExecution step1 = exec.getStepExecutions().stream()
                .filter(s -> "processStep".equals(s.getStepName()))
                .findFirst().orElseThrow();
        assertEquals(13, step1.getWriteCount()); // 2 of 15 skipped
    }
}
class CountryStatisticsProcessorTest {

    private final CountryStatisticsProcessor processor = new CountryStatisticsProcessor();

    @Test
    void roundsToTwoDecimals() {
        var input = new CountryStatistics("CANADA", 10, 116600.5432, 513.6612);

        var result = processor.process(input);

        assertThat(result.totalRevenue()).isEqualTo(116600.54);
        assertThat(result.averagePurchaseAmount()).isEqualTo(513.66);
    }
}
The demo project ties it all together:
- basics-output.csv — 5 uppercased names
- country-statistics.csv — 41 countries with revenue totals
- DemoJobExecutionListener — logs the lifecycle of each run
- Jobs launch from a CommandLineRunner. Each run passes a unique timestamp as a job parameter so it can run repeatedly without conflicting with previous instances (for production, @Scheduled is enough).

Takeaway: returning null from a processor is how you skip items — no special API, just return null.

Thanks for your attention
mvn spring-boot:run · docs.spring.io/spring-batch