This blog is written by Nitin Verma, Sr. Software Engineer, HDInsight.
Do you often restart or re-create your HDInsight HBase clusters, and wish that restart and re-create times were faster? If so, read on.
This blog introduces a new script for the HDInsight HBase service that conveniently flushes the MemStore of all HBase tables. The script can significantly reduce HBase service restart time by avoiding WAL recovery for region edits that have already been flushed.
When a flush 'table' operation is triggered, all the regions belonging to that table flush independently. Once the HFile corresponding to a region is flushed, the region records the max sequence ID in metadata and notifies the WAL corresponding to its region server. The WAL maintains a mapping table of regions and their corresponding flushed sequence IDs. When the HBase cluster restarts, the HMaster distributes the flushed sequence IDs per region to the recovery threads splitting the WAL, so that they can skip the edits that have already been persisted in HFiles.
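To make the flush operation concrete, here is a minimal sketch of issuing flush 'table' for several tables at once. It is not the contents of flush_all_tables.sh. The function names and the example table names are hypothetical, and piping commands into `hbase shell -n` (non-interactive mode) is an assumption about how you might drive the HBase shell from bash.

```shell
#!/usr/bin/env bash
# Sketch only: this is NOT the contents of flush_all_tables.sh.

# Print one HBase shell "flush" command per table name passed as an argument.
build_flush_commands() {
  local table
  for table in "$@"; do
    printf "flush '%s'\n" "$table"
  done
}

# Hypothetical helper: feed the generated commands to a non-interactive
# HBase shell. Assumes the `hbase` command is on PATH (for example, on a
# cluster head node). It is defined here but not called.
flush_tables() {
  build_flush_commands "$@" | hbase shell -n
}

# Example with made-up table names; this only prints the commands and
# does not contact HBase.
build_flush_commands orders customers
```

Printing the commands separately from running them lets you review what would be sent to the HBase shell before calling flush_tables on a live cluster.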
Below are the two ways to run the script.
1. Inside the cluster.
SSH to the head node of the cluster.
wget https://raw.githubusercontent.com/Azure/hbase-utils/master/scripts/flush_all_tables.sh
bash ./flush_all_tables.sh
2. From the Azure portal, using an HDInsight script action:
a. Log in to https://ms.portal.azure.com
b. Select the desired HBase cluster.
c. Click the "Script Actions" button.
d. Click the "+ Submit New" button.
e. Enter a short, meaningful name (for example, "Flushing all hbase tables").
f. Enter the following as the Bash Script URI:
https://raw.githubusercontent.com/Azure/hbase-utils/master/scripts/flush_all_tables.sh
g. Select only Head, and deselect the Region and Zookeeper nodes.
h. Enter hn1 as the parameter, so that the script executes on the idle headnode.
i. Click the Create button.
You can monitor the progress of the script from the Ambari UI through the "ops" button, which shows the active operation count in blue.
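The portal steps above could also be scripted. The following is a rough sketch, assuming the Azure CLI and its `az hdinsight script-action execute` command are available; the original post describes only the portal. The cluster name, resource group, and script action name are hypothetical placeholders. The function only prints the command so that it can be reviewed first; nothing is executed.

```shell
#!/usr/bin/env bash
# Sketch only: prints an Azure CLI command that mirrors the portal steps.
# Nothing is submitted to Azure by this block.

SCRIPT_URI="https://raw.githubusercontent.com/Azure/hbase-utils/master/scripts/flush_all_tables.sh"

# Arguments: cluster name, resource group (placeholders supplied by the caller).
# "flush-all-hbase-tables" is a hypothetical script action name.
print_script_action_command() {
  local cluster="$1" group="$2"
  printf 'az hdinsight script-action execute --cluster-name %s --resource-group %s --name %s --script-uri %s --roles headnode --script-parameters hn1\n' \
    "$cluster" "$group" "flush-all-hbase-tables" "$SCRIPT_URI"
}

# Example with made-up names.
print_script_action_command my-hbase-cluster my-resource-group
```

Here `--roles headnode` corresponds to selecting only Head in step g, and `--script-parameters hn1` corresponds to the parameter in step h.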