Saturday, July 25, 2015

HBase Spring Data Sample

Document  Version 1.0

<?xml version="1.0" encoding="UTF-8"?>
<beans xmlns="http://www.springframework.org/schema/beans"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xmlns:context="http://www.springframework.org/schema/context"
    xmlns:hdp="http://www.springframework.org/schema/hadoop"
    xmlns:p="http://www.springframework.org/schema/p"
    xsi:schemaLocation="http://www.springframework.org/schema/beans http://www.springframework.org/schema/beans/spring-beans.xsd
    http://www.springframework.org/schema/context http://www.springframework.org/schema/context/spring-context.xsd
    http://www.springframework.org/schema/hadoop http://www.springframework.org/schema/hadoop/spring-hadoop.xsd">

    <context:component-scan base-package="springdt" />
    <context:property-placeholder location="hbase.properties"/>

    <hdp:configuration id="hadoopConfiguration">
      fs.defaultFS=hdfs://127.0.0.1:9000
    </hdp:configuration>

    <hdp:hbase-configuration configuration-ref="hadoopConfiguration" zk-quorum="127.0.0.1" zk-port="2181"/>

    <bean id="hbaseTemplate" class="org.springframework.data.hadoop.hbase.HbaseTemplate">
        <property name="configuration" ref="hbaseConfiguration"/>
    </bean>
    <bean id="hBaseService" class="springdt.HBaseService"/>

</beans>


The "hBaseService" bean is the only application bean in this sample; the other beans are Spring for Apache Hadoop infrastructure. Its code is shown in the next step.

4. Write code accessing HBase

In this sample we will create a table in HBase called "report" and store some data in it. The table has one column family called "data", which contains a single column named "file". The value of "file" is a file name, which in real life could be the name of a report.
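
To make the table layout concrete, here is a minimal sketch in plain Java (no HBase dependency) that previews the row keys and "data:file" values the sample is going to write. The class name "ReportRowPreview" and its methods are hypothetical helpers made up for this illustration; the "row" and "report24.csv-" patterns mirror the constants used in the service class.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of the sample project: it only previews
// the row keys and cell values and does not talk to HBase at all.
final class ReportRowPreview {

    // Same patterns as the constants in the sample's service class.
    static final String ROW_NAME_PATTERN = "row";
    static final String VALUE_PREFIX = "report24.csv-";

    // Row key for the i-th entry, e.g. "row" followed by the index.
    static String rowKey(int i) {
        return ROW_NAME_PATTERN + i;
    }

    // Value stored in column "file" of column family "data" for the i-th entry.
    static String fileValue(int i) {
        return VALUE_PREFIX + i;
    }

    // Builds rowKey -> value pairs in insertion order.
    static Map<String, String> preview(int count) {
        Map<String, String> rows = new LinkedHashMap<>();
        for (int i = 0; i < count; i++) {
            rows.put(rowKey(i), fileValue(i));
        }
        return rows;
    }

    public static void main(String[] args) {
        for (Map.Entry<String, String> e : preview(3).entrySet()) {
            System.out.println(e.getKey() + "  data:file=" + e.getValue());
        }
    }
}
```

Each map entry corresponds to one HBase row with a single cell in "data:file".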

package springdt;

import javax.inject.Inject;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.client.HBaseAdmin;
import org.apache.hadoop.hbase.client.HTableInterface;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.util.Bytes;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.context.support.AbstractApplicationContext;
import org.springframework.context.support.ClassPathXmlApplicationContext;
import org.springframework.data.hadoop.hbase.HbaseTemplate;
import org.springframework.data.hadoop.hbase.TableCallback;
import org.springframework.stereotype.Service;

/**
 * Demonstrating how to access HBase using Spring Data for Hadoop.
 * 
 * @author swang
 * 
 */
@Service
public class HBaseService {

    // Resolved by field name: the context defines two beans of type
    // Configuration ("hadoopConfiguration" and "hbaseConfiguration").
    @Autowired
    private Configuration hbaseConfiguration;

    @Inject
    private HbaseTemplate hbTemplate;

    // Table info
    final String tableName = "report";
    final String columnFamilyData = "data";
    final String colFile = "file";
    final String rowNamePattern = "row";
    final String value = "report24.csv-";

    /**
     * Recreates the table, then fills it with sample data.
     *
     * @throws Exception if an HBase call fails
     */
    public void run() throws Exception {
        // 1. create table
        createTable();
        // 2. add data entries
        addData();
    }

    /**
     * Creates the HBase table, dropping an existing table of the same name first.
     *
     * @throws Exception if an HBase admin call fails
     */
    public void createTable() throws Exception {
        HBaseAdmin admin = new HBaseAdmin(hbaseConfiguration);
        try {
            if (admin.tableExists(tableName)) {
                // A table must be disabled before it can be deleted.
                admin.disableTable(tableName);
                admin.deleteTable(tableName);
            }

            HTableDescriptor tableDes = new HTableDescriptor(tableName);
            HColumnDescriptor cf1 = new HColumnDescriptor(columnFamilyData);
            tableDes.addFamily(cf1);
            admin.createTable(tableDes);
        } finally {
            // Release the resources held by the admin client.
            admin.close();
        }
    }

    /**
     * Adds 1000 data entries to the report table.
     */
    private void addData() {
        hbTemplate.execute(tableName, new TableCallback<Boolean>() {

            @Override
            public Boolean doInTable(HTableInterface table) throws Throwable {
                for (int i = 0; i < 1000; i++) {
                    // One row per entry, with a single cell in data:file.
                    Put p = new Put(Bytes.toBytes(rowNamePattern + i));
                    p.add(Bytes.toBytes(columnFamilyData),
                            Bytes.toBytes(colFile), Bytes.toBytes(value + i));
                    table.put(p);
                }
                return Boolean.TRUE;
            }
        });
    }

    public static void main(String[] args) throws Exception {
        AbstractApplicationContext ctx = new ClassPathXmlApplicationContext(
                "SpringBeans.xml");
        try {
            HBaseService hBaseService = (HBaseService) ctx.getBean("hBaseService");
            hBaseService.run();
        } finally {
            // Shut down the Spring context so its resources are released.
            ctx.close();
        }
    }
}

The "SpringBeans.xml" file we created in step 3 should be placed in "src/main/resources" so that it can be found on the classpath at runtime.
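
If the application context fails to load, one thing worth ruling out is that the file never made it onto the classpath. The following is a small sketch using only the JDK; the class name "ClasspathCheck" and the method "isOnClasspath" are hypothetical names chosen for this illustration, not part of the sample.

```java
import java.net.URL;

// Hypothetical diagnostic helper: reports whether a named resource
// is visible to the class loader, which is how Spring's
// ClassPathXmlApplicationContext locates "SpringBeans.xml".
final class ClasspathCheck {

    static boolean isOnClasspath(String resourceName) {
        ClassLoader cl = Thread.currentThread().getContextClassLoader();
        if (cl == null) {
            // Fall back to the loader that loaded this class.
            cl = ClasspathCheck.class.getClassLoader();
        }
        URL url = (cl != null)
                ? cl.getResource(resourceName)
                : ClassLoader.getSystemResource(resourceName);
        return url != null;
    }

    public static void main(String[] args) {
        String name = args.length > 0 ? args[0] : "SpringBeans.xml";
        System.out.println(name
                + (isOnClasspath(name) ? " was found" : " was NOT found")
                + " on the classpath");
    }
}
```

Run it from the same project (and therefore with the same classpath) as the sample; if it reports that the file was not found, check that "src/main/resources" is configured as a source folder in your build.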

5. Run the sample

We assume you have HBase installed on top of Hadoop. Make sure your HBase version is compatible with your Hadoop version. In my case I have "hbase-1.0.1.1" and "hadoop-2.7.0".
Now start the HBase server.
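
Before running the sample, it can help to confirm that the endpoints named in "SpringBeans.xml" accept connections. Below is a rough sketch using plain JDK sockets; the class "PortCheck" and its method are hypothetical, and a successful TCP connect only shows that something is listening on the port, not that HBase itself is healthy.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical pre-flight check: tries to open a TCP connection
// to a host and port within the given timeout.
final class PortCheck {

    static boolean isReachable(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Connection refused, timed out, or host not resolvable.
            return false;
        }
    }

    public static void main(String[] args) {
        // Host and ports taken from the Spring configuration of this sample.
        System.out.println("ZooKeeper 127.0.0.1:2181 reachable: "
                + isReachable("127.0.0.1", 2181, 2000));
        System.out.println("HDFS      127.0.0.1:9000 reachable: "
                + isReachable("127.0.0.1", 9000, 2000));
    }
}
```

If either line prints "false", start the corresponding service first, or adjust the host and port values in the configuration to match your installation.
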
Run "HBaseService" in Eclipse as a Java application. In the Eclipse console you will see output like this:




2015-07-25 15:10:52 INFO  ClientCnxn:852 - Socket connection established to 127.0.0.1/127.0.0.1:2181, initiating session
2015-07-25 15:10:52 INFO  ClientCnxn:1235 - Session establishment complete on server 127.0.0.1/127.0.0.1:2181, sessionid = 0x14ec4b8cf30000e, negotiated timeout = 90000
2015-07-25 15:10:53 INFO  HBaseAdmin:978 - Started disable of report
2015-07-25 15:10:55 INFO  HBaseAdmin:1033 - Disabled report
2015-07-25 15:10:55 INFO  HBaseAdmin:738 - Deleted report


"HBaseService" should have created 1000 entries in the HBase table "report". Verify this in the HBase shell:



hbase(main):003:0> scan "report"
ROW          COLUMN+CELL
row0          column=data:file, timestamp=1437823593865, value=report24.csv-0
row1          column=data:file, timestamp=1437823593897, value=report24.csv-1
...
row999     column=data:file, timestamp=1437823603565, value=report24.csv-999 
1000 row(s) in 2.0600 seconds
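
A side note on the scan order: HBase returns rows sorted by the bytes of the row key, not by the numeric suffix, so the rows elided by "..." above do not appear in counting order. The sketch below reproduces that ordering in plain Java with a sorted set (for ASCII keys like these, String ordering matches byte ordering). The class "RowKeyOrder" and the zero-padded key variant are illustrative assumptions, not something the sample itself does.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.TreeSet;

// Hypothetical illustration of lexicographic row key ordering.
final class RowKeyOrder {

    // Builds a key either as in the sample ("row" + i) or zero-padded to 3 digits.
    static String key(int i, boolean padded) {
        return padded ? String.format(Locale.ROOT, "row%03d", i) : "row" + i;
    }

    // Returns the first 'limit' keys of 'total' generated keys in sorted order.
    static List<String> firstKeys(boolean padded, int total, int limit) {
        TreeSet<String> sorted = new TreeSet<>();
        for (int i = 0; i < total; i++) {
            sorted.add(key(i, padded));
        }
        List<String> head = new ArrayList<>();
        for (String k : sorted) {
            if (head.size() == limit) {
                break;
            }
            head.add(k);
        }
        return head;
    }

    public static void main(String[] args) {
        System.out.println("as in the sample: " + firstKeys(false, 1000, 5));
        System.out.println("zero-padded:      " + firstKeys(true, 1000, 5));
    }
}
```

With the sample's keys the first five rows are row0, row1, row10, row100, row101; with zero-padded keys they would be row000 through row004. Padding is a common choice when scans should follow numeric order.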



Now we are done with the simple example. Have fun!