Practical Hadoop Security
Format: PDF / Kindle (mobi) / ePub
Practical Hadoop Security is an excellent resource for administrators planning a production Hadoop deployment who want to secure their Hadoop clusters. In this detailed guide to the security options and configuration available within Hadoop itself, author Bhushan Lakhe takes you through a comprehensive, hands-on study of how to implement defined security within a Hadoop cluster.
You will start with a detailed overview of all the security options available for Hadoop, including popular extensions like Kerberos and OpenSSH, and then delve into a hands-on implementation of user security (with illustrated code samples) using both out-of-the-box features and security extensions implemented by leading vendors.
No security system is complete without a monitoring and tracing facility, so Practical Hadoop Security next steps you through audit logging and monitoring technologies for Hadoop, as well as ready-to-use implementation and configuration examples, again with illustrated code samples.
The book concludes with the most important aspect of Hadoop security: encryption. Both types of encryption, for data in transit and for data at rest, are discussed at length, along with leading open source projects that integrate directly with Hadoop at no licensing cost.
Practical Hadoop Security:
- Explains the importance of security, auditing, and encryption within a Hadoop installation
- Describes how the leading players have incorporated these features within their Hadoop distributions and provided extensions
- Demonstrates how to set up and use these features to your benefit and make your Hadoop installation secure without impacting performance or ease of use
and place it in the keytab directory (/etc/security/keytabs) of the respective components (kadmin: is the prompt; commands are in bold):

[root@pract_hdp_sec]# kadmin
Authenticating as principal root/admin@EXAMPLE.COM with password.
Password for root/admin@EXAMPLE.COM:
kadmin: xst -k mapred.keytab hdfs/pract_hdp_sec@EXAMPLE.COM HTTP/pract_hdp_sec@EXAMPLE.COM
Entry for principal hdfs/pract_hdp_sec@EXAMPLE.COM with kvno 5, encryption type aes128-cts-hmac-sha1-96 added to keytab
tables Driver_details and Ticket_details, but only read permission to Judgement_details. The reason is that police officers shouldn't have permission to change the details of a judgment. You will also observe that police officers have write permission to Judgement_details_PO; this allows them to correct the first two columns (which hold no judicial information) in case there is any error. The next role is for employees working in the IT department: IT_role = server=MyServer->db=db1->table=
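For reference, per-table privileges of the kind described above are commonly expressed in a Sentry file-based policy. The fragment below is a hypothetical sketch using the same role syntax shown in the text; the group names, file name, and table-to-action mappings are assumptions, not the book's exact policy:

```ini
; Hypothetical sentry-provider.ini fragment (names are illustrative).
[groups]
; members of the OS/LDAP group 'police' receive PO_role
police = PO_role

[roles]
; write (insert) on two tables, read-only (select) on the judgment table
PO_role = server=MyServer->db=db1->table=Driver_details->action=insert, \
          server=MyServer->db=db1->table=Ticket_details->action=insert, \
          server=MyServer->db=db1->table=Judgement_details->action=select
```

A policy file like this is referenced from Sentry's configuration; the key idea is that each role is a comma-separated list of privilege strings, each scoped down from server to database to table to action.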
Daemon Logs

Hadoop daemon logs are logs generated by Hadoop daemons (NameNode, DataNode, JobTracker, etc.) and located under /var/log/hadoop; the actual directories may vary as per the Hadoop distribution used. The available logs are as follows:
- NameNode logs (hadoop-hdfs-namenode-xxx.log), containing information about file opens and creates, and metadata operations such as renames, mkdir, and so forth.
- DataNode logs (hadoop-hdfs-datanode-xxx.log), containing information about DataNode access and
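To make the NameNode log contents above concrete, here is a minimal Python sketch that pulls the user, operation, and source path out of HDFS audit-style log lines. The sample lines and the regular expression are illustrative assumptions about the log format, not an exact parser for any particular Hadoop distribution:

```python
import re

# Extract user (ugi=...), operation (cmd=...), and path (src=...) fields
# from HDFS audit-style log lines. The real files live under
# /var/log/hadoop; names vary by distribution.
AUDIT_PATTERN = re.compile(r"ugi=(?P<user>\S+).*?cmd=(?P<cmd>\S+).*?src=(?P<src>\S+)")

def parse_audit_lines(lines):
    """Return (user, cmd, src) tuples for lines matching the audit format."""
    events = []
    for line in lines:
        m = AUDIT_PATTERN.search(line)
        if m:
            events.append((m.group("user"), m.group("cmd"), m.group("src")))
    return events

# Illustrative sample lines (invented for this sketch).
sample = [
    "2014-02-08 12:00:01 INFO FSNamesystem.audit: allowed=true ugi=hdfs "
    "cmd=mkdirs src=/data/new dst=null perm=hdfs:hadoop:rwxr-xr-x",
    "2014-02-08 12:00:05 INFO FSNamesystem.audit: allowed=true ugi=alice "
    "cmd=rename src=/data/a dst=/data/b perm=alice:hadoop:rw-r--r--",
]

for user, cmd, src in parse_audit_lines(sample):
    print(user, cmd, src)
```

The same pattern applies to DataNode logs; only the field names in the regular expression would change.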
could have checked all Hive alerts from Nagios to see if RogueITGuy was involved in any other issues. Joins: Allow you to link two completely different data sets based on a field such as username or event ID. Using Splunk, I could link monitoring data from Ganglia with Hadoop log data using the username RogueITGuy and investigate what else he accessed while performing his known illegal activities. Last, Splunk offers Hunk, an analytics tool designed specifically for Hadoop and NoSQL data.
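Outside Splunk, the username join described above can be sketched in a few lines of code. The following Python illustration joins invented monitoring records with invented log events on the user field; all records, field names, and the helper function are assumptions made for this example:

```python
# Minimal sketch of joining two data sets on a username field,
# as described above for Splunk. All records are invented.
monitoring = [  # e.g., derived from Ganglia metrics
    {"user": "RogueITGuy", "host": "node3", "cpu_pct": 97},
    {"user": "alice", "host": "node1", "cpu_pct": 12},
]
log_events = [  # e.g., derived from Hadoop logs
    {"user": "RogueITGuy", "cmd": "delete", "src": "/payroll"},
    {"user": "alice", "cmd": "open", "src": "/reports/q1"},
]

def join_on_user(left, right):
    """Inner-join two lists of dicts on their 'user' key."""
    by_user = {}
    for rec in right:
        by_user.setdefault(rec["user"], []).append(rec)
    return [
        {**l, **r}
        for l in left
        for r in by_user.get(l["user"], [])
    ]

for row in join_on_user(monitoring, log_events):
    print(row)
```

A tool like Splunk or Hunk performs this kind of correlation at scale and interactively, but the underlying operation is the same inner join on a shared field.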
/home/hadoop/lib/httpclient-4.1.1.jar

Now you're ready to run the S3DistCp utility and copy a file test1 from HDFS to folder test in the S3 bucket htestbucket:

> hadoop jar /home/hadoop/lib/emr-s3distcp-1.0.jar -libjars /home/hadoop/lib/gson-2.1.jar,/home/hadoop/lib/emr-s3distcp-1.0.jar,/home/hadoop/lib/EmrMetrics-1.0.jar,/home/hadoop/lib/httpcore-4.1.jar,/home/hadoop/lib/httpclient-4.1.1.jar --src /tmp/test1 --dest s3://htestbucket/test/ --disableMultipartUpload --s3ServerSideEncryption

My
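As an aside, if you write to S3 through Hadoop's s3a connector rather than the EMR S3DistCp flag shown above, server-side encryption can be requested via configuration instead. The fragment below is a sketch assuming an s3a-based setup (not part of the book's EMR example):

```xml
<!-- core-site.xml fragment: request SSE-S3 (AES256) server-side
     encryption when writing objects through the s3a connector. -->
<property>
  <name>fs.s3a.server-side-encryption-algorithm</name>
  <value>AES256</value>
</property>
```

Either way, the encryption is applied by S3 itself when the object is stored; the data is still sent over the wire, so transport security (HTTPS) remains a separate concern.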