Friday, December 9, 2016

My First Amazon Skill...

I know this is irrelevant to my Oracle blog, but hey, this is my blog so I can blog anything I want :)

Anyway, I got hooked on this new gadget, the Amazon Echo, over Thanksgiving, and we are loving it.

So I said, why not create my own Skill to play the Telugu radio stations that I always listen to on my phone!

That led me to explore, and I finally got it working.

You can get my code from GitHub.

Please let me know your feedback, and please remember that this is my first Skill and my first upload to GitHub, so be generous with your comments :)


Saturday, November 5, 2016

Convert xlsx to csv in Linux...

You can use Open Office for Linux to open an xlsx file and then save it as csv. But it takes a while to load the file in Open Office and then convert it.

Also, what if there are multiple files that you want to convert?

You need "ssconvert", which converts files on the command line and does a pretty good job.

Download the two RPMs below and install them using rpm -Uvh:

http://rpm.pbone.net/index.php3/stat/4/idpl/28272017/dir/fedora_21/com/goffice-0.10.18-1.fc21.x86_64.rpm.html

http://rpm.pbone.net/index.php3/stat/4/idpl/28271945/dir/fedora_21/com/gnumeric-1.12.18-1.fc21.x86_64.rpm.html

The gnumeric RPM contains ssconvert. Once installed, you can simply use it as:
[santhosh@localhost myspark]$ ssconvert 'Member accounts 11042016.xls' memberac.csv
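To handle the multiple-files case mentioned above, ssconvert also works nicely in a shell loop; a minimal sketch, assuming all the .xlsx files are in the current directory:

[santhosh@localhost myspark]$ for f in *.xlsx; do ssconvert "$f" "${f%.xlsx}.csv"; done

The ${f%.xlsx} expansion strips the extension, so each output file keeps the original name with a .csv suffix.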



Wednesday, November 2, 2016

Spark Aggregate function with examples...

The aggregate function takes three parameters (arguments):
1st parameter is the seed (initial) value, a 2-tuple such as (0, 0) or (0, 1) in these examples
2nd parameter is the within-partition reduce function (seqOp)
3rd parameter is the combine reduce function (combOp)

The aggregate function allows the user to apply two different reduce functions to the RDD at the same time.
The first reduce function is applied within each partition to reduce the data within each partition into a single result.
The second reduce function is used to combine the results from all the partitions from the first reduce function.

The ability to have two separate reduce functions, one for intra-partition reducing and one for across-partition reducing, adds a lot of flexibility. For example, the first reduce function can be the max function and the second one can be the sum function. The user also specifies an initial value. Here are some important facts.
  • The initial value (1st parameter seed value) is applied at both levels of reduce. So both at the intra partition reduction and across partition reduction.
  • Both reduce functions have to be commutative and associative.
  • Do not assume any execution order for either partition computations or combining partitions.
Let's start with simple examples. Note: changing the seed values drastically changes the results, because the 2nd reduce function (combOp) combines the seqOp (sequential operation) results from the individual partitions.
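A reconstruction of the session (the original post showed screenshots here, so the exact code is an assumption based on the surrounding text; the explicit 4-partition split is inferred from the results shown further down):

>>> collData = sc.parallelize([1, 2, 3, 4, 5], 4)        # 4 partitions, inferred from the post's results
>>> seqOp = (lambda x, y: (x[0] + y, x[1] + y))          # first reduce: runs within each partition
>>> combOp = (lambda x, y: (x[0] + y[0], x[1] + y[1]))   # second reduce: merges partition results
>>> collData.aggregate((0, 0), seqOp, combOp)
(15, 15)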
From the above:
seqOp ==> is the first reduce function, which runs within each partition across all the nodes
combOp ==> is the second reduce function, which combines the results from seqOp

seqOp and combOp are passed to the "aggregate" function along with a 2-element seed for the "collData" collection, as collData.aggregate((0, 0), seqOp, combOp).
The first seed element 0 feeds the "x[0] + y" term of the seqOp lambda, and the second seed element feeds the "x[1] + y" term.
combOp just sums the results of seqOp from all the partitions.

So, for every element (1, 2, 3, 4, 5) of the collData list, x[0] and x[1] carry the running totals. In this example we pass 0 for both seed values, so it just adds 0 to the sum of the elements and gives the result (15, 15), where 15 is (0+1) + (0+2) + (0+3) + (0+4) + (0+5).

Let's change the aggregate seed to (0, 1) and see the behavior.
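Again a reconstruction of the screenshot, reusing the same collData, seqOp and combOp (with a non-zero seed the answer depends on the partition count, since the seed is applied once per partition plus once in combOp):

>>> collData.aggregate((0, 1), seqOp, combOp)
(15, 20)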
As you can see, the result changed to (15, 20). Why?
Because we changed the second seed value to 1, which results in 20, as shown below:
x[0] + y will be iterated as:
(0 + 1) + (0 + 2) + (0 + 3) + (0 + 4) + (0 + 5)
= 1 + 2 + 3 + 4 + 5
= 15
x[1] + y will be iterated as:
(1 + 1) + (1 + 2) + (1 + 3) + (1 + 4) + (1 + 5)
= 2 + 3 + 4 + 5 + 6
= 20
(Strictly speaking, the seed is applied once per partition plus once more in combOp rather than once per element, as the facts above say; the per-element expansion happens to give the same totals here because the RDD has four partitions, i.e. five seed applications for five elements.)

Let's change the seed to (2, 3) and see the results:
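Reconstruction of the screenshot:

>>> collData.aggregate((2, 3), seqOp, combOp)
(25, 30)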
Here is how the iterations occurred to get the result (25, 30):
x[0] + y will be iterated as:
(2 + 1) + (2 + 2) + (2 + 3) + (2 + 4) + (2 + 5)
= 3 + 4 + 5 + 6 + 7
= 25
x[1] + y will be iterated as:
(3 + 1) + (3 + 2) + (3 + 3) + (3 + 4) + (3 + 5)
= 4 + 5 + 6 + 7 + 8
= 30

Here is another example, with the sum and the multiplication of the elements computed at the same time:
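A reconstruction of the screenshot; the seqOp now multiplies in its second position, so the combOp must multiply there as well (an assumption consistent with the results below):

>>> seqOp2 = (lambda x, y: (x[0] + y, x[1] * y))         # sum in the 1st slot, product in the 2nd
>>> combOp2 = (lambda x, y: (x[0] + y[0], x[1] * y[1]))
>>> collData.aggregate((0, 1), seqOp2, combOp2)
(15, 120)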
From the above, notice that the second term of the lambda function is a multiplication (*) instead of a summation (+).
What we are doing in this example is getting the sum of all the elements and the product of all the elements in the collection at the same time.
Here is how the iteration goes with the arguments passed to the "aggregate" function:
the seqOp lambda (x[0] + y, x[1] * y) is iterated through all the elements in collData with the seed (0, 1)

-------------------- x[0] + y -------------------, -------------------- x[1] * y ------------------
( (0 + 1) + (0 + 2) + (0 + 3) + (0 + 4) + (0 + 5) , (1 * 1) * (1 * 2) * (1 * 3) * (1 * 4) * (1 * 5) )
= ((1 + 2 + 3 + 4 + 5), (1 * 2 * 3 * 4 * 5))
= (15, 120)

Let's change the seed values to aggregate and see the result:
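Reconstruction of the screenshot:

>>> collData.aggregate((2, 3), seqOp2, combOp2)
(25, 29160)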
This time we passed the seed (2, 3) to aggregate.
Here is how the iteration goes with the seed (2, 3):
the seqOp lambda (x[0] + y, x[1] * y) is iterated through all the elements in collData with the seed (2, 3)

-------------------- x[0] + y -------------------, -------------------- x[1] * y ------------------
( (2 + 1) + (2 + 2) + (2 + 3) + (2 + 4) + (2 + 5) , (3 * 1) * (3 * 2) * (3 * 3) * (3 * 4) * (3 * 5) )
= ((3 + 4 + 5 + 6 + 7), (3 * 6 * 9 * 12 * 15))
= (25, 29160)

Thursday, October 6, 2016

CDH 5.8.2 Login Page error 500

You may get this error when there is any kind of interruption to the Cloudera server process. In my case, I get it every time I bring my PC back up after either hibernate or sleep.

Just restart the Cloudera server, wait for it to come up, and then try to log in again; you should be good:

[root@localhost conf]# service cloudera-scm-server restart
Restarting cloudera-scm-server (via systemctl):            [  OK  ]
[root@localhost conf]# 

HTTP ERROR 500

Problem accessing /cmf/login. Reason:
    Error creating bean with name 'newServiceHandlerRegistry' defined in class path resource [com/cloudera/server/cmf/config/components/BeanConfiguration.class]: Instantiation of bean failed; nested exception is org.springframework.beans.factory.BeanDefinitionStoreException: Factory method [public com.cloudera.cmf.service.ServiceHandlerRegistry com.cloudera.server.cmf.config.components.BeanConfiguration.newServiceHandlerRegistry()] threw exception; nested exception is java.lang.IllegalStateException: BeanFactory not initialized or already closed - call 'refresh' before accessing beans via the ApplicationContext

Caused by:

org.springframework.beans.factory.BeanCreationException: Error creating bean with name 'newServiceHandlerRegistry' defined in class path resource [com/cloudera/server/cmf/config/components/BeanConfiguration.class]: Instantiation of bean failed; nested exception is org.springframework.beans.factory.BeanDefinitionStoreException: Factory method [public com.cloudera.cmf.service.ServiceHandlerRegistry com.cloudera.server.cmf.config.components.BeanConfiguration.newServiceHandlerRegistry()] threw exception; nested exception is java.lang.IllegalStateException: BeanFactory not initialized or already closed - call 'refresh' before accessing beans via the ApplicationContext
 at org.springframework.beans.factory.support.ConstructorResolver.instantiateUsingFactoryMethod(ConstructorResolver.java:581)
 at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.instantiateUsingFactoryMethod(AbstractAutowireCapableBeanFactory.java:983)
 at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.createBeanInstance(AbstractAutowireCapableBeanFactory.java:879)
 at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.doCreateBean(AbstractAutowireCapableBeanFactory.java:485)
 at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.createBean(AbstractAutowireCapableBeanFactory.java:456)
 at org.springframework.beans.factory.support.AbstractBeanFactory$1.getObject(AbstractBeanFactory.java:293)
 at org.springframework.beans.factory.support.DefaultSingletonBeanRegistry.getSingleton(DefaultSingletonBeanRegistry.java:222)
 at org.springframework.beans.factory.support.AbstractBeanFactory.doGetBean(AbstractBeanFactory.java:290)
 at org.springframework.beans.factory.support.AbstractBeanFactory.getBean(AbstractBeanFactory.java:192)
 at org.springframework.beans.factory.support.DefaultListableBeanFactory.findAutowireCandidates(DefaultListableBeanFactory.java:848)
 at org.springframework.beans.factory.support.DefaultListableBeanFactory.doResolveDependency(DefaultListableBeanFactory.java:790)
 at org.springframework.beans.factory.support.DefaultListableBeanFactory.resolveDependency(DefaultListableBeanFactory.java:707)
 at org.springframework.beans.factory.support.ConstructorResolver.resolveAutowiredArgument(ConstructorResolver.java:795)
 at org.springframework.beans.factory.support.ConstructorResolver.resolvePreparedArguments(ConstructorResolver.java:765)
 at org.springframework.beans.factory.support.ConstructorResolver.autowireConstructor(ConstructorResolver.java:131)
 at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.autowireConstructor(AbstractAutowir

Exception in secureMain java.io.IOException: failed to stat a path component: '/var/run/hdfs-sockets'.

This is in CDH 5.8.2

The DataNode was not able to start due to:

FATAL org.apache.hadoop.hdfs.server.datanode.DataNode
Exception in secureMain
java.io.IOException: failed to stat a path component: '/var/run/hdfs-sockets'.  error code 2 (No such file or directory)
Creating a folder 'hdfs-sockets' under /var/run and changing the owner:group to cloudera-scm solves the problem, but this folder gets removed after a reboot; see the sketch below.
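A minimal sketch of the workaround (note 755 rather than 775; the DataNode also complains if the folder is group-writable, as noted in the install post below):

mkdir -p /var/run/hdfs-sockets
chown cloudera-scm:cloudera-scm /var/run/hdfs-sockets
chmod 755 /var/run/hdfs-sockets

On OL7, /var/run lives on tmpfs, which is why the folder disappears at reboot. One way to have it recreated automatically at boot (an assumption, not something from the original post) is a systemd-tmpfiles entry:

# /etc/tmpfiles.d/hdfs-sockets.conf
d /var/run/hdfs-sockets 0755 cloudera-scm cloudera-scm -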


How to disable "information" messages in pyspark (python spark) console?


You may notice a bunch of messages popping up on the console, as shown below, when you start the Spark console using pyspark.
How do you disable these messages and show only the errors?

[santhosh@localhost Downloads]$ pyspark
Python 2.7.5 (default, Sep 14 2016, 08:35:31)
[GCC 4.8.5 20150623 (Red Hat 4.8.5-4)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
16/10/06 11:25:20 INFO spark.SparkContext: Running Spark version 1.6.0
16/10/06 11:25:21 INFO spark.SecurityManager: Changing view acls to: santhosh
16/10/06 11:25:21 INFO spark.SecurityManager: Changing modify acls to: santhosh
.
.
.
16/10/06 11:25:30 INFO storage.BlockManagerMaster: Registered BlockManager
16/10/06 11:25:30 INFO scheduler.EventLoggingListener: Logging events to hdfs://localhost:8020/user/spark/applicationHistory/application_1475694807177_0010
16/10/06 11:25:31 INFO cluster.YarnClientSchedulerBackend: SchedulerBackend is ready for scheduling beginning after reached minRegisteredResourcesRatio: 0.8
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /__ / .__/\_,_/_/ /_/\_\   version 1.6.0
      /_/

Using Python version 2.7.5 (default, Sep 14 2016 08:35:31)
SparkContext available as sc, HiveContext available as sqlContext.
>>>

Solution:
Edit the file /etc/spark/conf/log4j.properties
and change the value of the root logger from INFO to ERROR:
root.logger=INFO,console --> root.logger=ERROR,console
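Alternatively, to quiet down an already-running session without touching the config file, Spark (1.4 onward) can change the level programmatically:

>>> sc.setLogLevel("ERROR")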



Tuesday, September 27, 2016

Installing Google Chrome on Oracle Linux...

These steps are for Oracle Linux 64 bit:

Create a file google.repo under /etc/yum.repos.d/ folder with the following content.

[google-chrome]
name=google-chrome - 64-bit
baseurl=http://dl.google.com/linux/chrome/rpm/stable/x86_64
enabled=1
gpgcheck=1
gpgkey=https://dl-ssl.google.com/linux/linux_signing_key.pub

Save the file and execute the below command to install Chrome:

# yum install google-chrome-stable

Accept the prompts, and once done you will see Chrome under Applications/Internet.

Edit (update as of 12/03/2018):
Looks like something broke in Oracle Linux 7.6, as the steps mentioned above were not successful in my new install.

I was getting the below error:

Error: Package: google-chrome-stable-70.0.3538.110-1.x86_64 (google-chrome)
           Requires: liberation-fonts
 You could try using --skip-broken to work around the problem
 You could try running: rpm -Va --nofiles --nodigest

Here is what got me going:

1] In the google.repo file, replace x86_64 with $basearch:
baseurl=https://dl.google.com/linux/chrome/rpm/stable/$basearch
2] Using wget with Oracle Linux 7:

# wget -O /etc/yum.repos.d/public-yum-ol7.repo http://yum.oracle.com/public-yum-ol7.repo

3] Edit the file /etc/yum.repos.d/public-yum-ol7.repo
    and set the enabled flag to 1 for the below sections:
    [ol7_software_collections]
    [ol7_latest]
4] Run yum install scl-utils:
    # yum install scl-utils

5] Now run yum install google-chrome-stable




Friday, September 9, 2016

CDH 5 (Cloudera) Single User Mode Installation

I am not going to write too much about Cloudera, as there is a ton of information available out in the world of Google.

What I am going to write here is the simple steps to set up Cloudera in Single User Mode on Oracle Linux 7.

You can download Oracle Linux 7 from https://edelivery.oracle.com/osdc/faces/Home.jspx (you need to have Metalink access).

Install Oracle Linux with "Server with GUI" as the Base Environment, and "Compatibility Libraries" and "Development Tools" as Add-Ons.

I spent a couple of long hours installing Cloudera Manager, hitting many different errors and what not. So I finally came up with these simple steps and scripts to get the CDH 5 environment up and running in probably about an hour.

Used laptop configuration:
i5 processor with 16GB RAM (good luck if you have anything less than this)
OS: OL7

Steps:
01] Disable SELinux - edit the /etc/selinux/config file, change the value of SELINUX to disabled, reboot, and make sure the "getenforce" command returns Disabled.

02] Add the cloudera-scm user group to the SUDO list:
#visudo
and then add the below line:
%cloudera-scm ALL=(ALL) NOPASSWD:ALL
Also add the below lines - believe me, you are going to hit some errors if you don't:
Defaults secure_path = /sbin:/bin:/usr/sbin:/usr/bin
Defaults env_keep +="JAVA_HOME"

Comment out this line: 'Defaults requiretty'

03] Add the following line to the /etc/pam.d/su file:
session required pam_limits.so

04] Disable ipv6:

edit /etc/sysctl.conf and add the below:

# to disable IPv6 on all interfaces system wide
net.ipv6.conf.all.disable_ipv6 = 1
net.ipv6.conf.default.disable_ipv6 = 1

# to disable IPv6 on a specific interface (e.g., eth0, lo)
net.ipv6.conf.enp0s25.disable_ipv6 = 1
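To apply these settings without a reboot, the standard sysctl reload works:

# sysctl -p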

05] Make sure your /etc/hosts file has only this line and nothing else (because this is Single User Mode, just 127.0.0.1 is enough):

127.0.0.1 localhost

06] Now comes the fun part. This is where I had to spend way too much time figuring out what was causing the issue. Even though we added cloudera-scm to the SUDOers list and logged in as root, the installer was not able to create any directories, even after providing the root password and password-less SSH. So I ended up capturing every single folder that is needed, with the appropriate mode, in the below script. Run this script to create all the required folders.

======Begin=====
mkdir -vp /var/lib/zookeeper
chown cloudera-scm:cloudera-scm /var/lib/zookeeper
chmod 775 /var/lib/zookeeper
mkdir -vp /var/log/zookeeper
chown cloudera-scm:cloudera-scm /var/log/zookeeper
chmod 775 /var/log/zookeeper
mkdir -vp /var/lib/zookeeper/version-2
chown cloudera-scm:cloudera-scm /var/lib/zookeeper/version-2
chmod 775 /var/lib/zookeeper/version-2
mkdir /cloudera_manager_zookeeper_canary
chown cloudera-scm:cloudera-scm /cloudera_manager_zookeeper_canary
chmod 775 /cloudera_manager_zookeeper_canary
mkdir -vp /var/lib/cloudera-scm-eventserver
chown cloudera-scm:cloudera-scm /var/lib/cloudera-scm-eventserver
chmod 775 /var/lib/cloudera-scm-eventserver
mkdir -vp /var/log/cloudera-scm-eventserver
chown cloudera-scm:cloudera-scm /var/log/cloudera-scm-eventserver
chmod 775 /var/log/cloudera-scm-eventserver
mkdir -vp /var/lib/cloudera-host-monitor /var/log/cloudera-host-monitor
chown cloudera-scm:cloudera-scm /var/lib/cloudera-host-monitor/ /var/log/cloudera-host-monitor/
chmod 775 /var/lib/cloudera-host-monitor/ /var/log/cloudera-host-monitor/
mkdir -vp /var/lib/cloudera-scm-firehose /var/log/cloudera-scm-firehose
chown cloudera-scm:cloudera-scm /var/lib/cloudera-scm-firehose /var/log/cloudera-scm-firehose
chmod 775 /var/lib/cloudera-scm-firehose /var/log/cloudera-scm-firehose
mkdir -vp /var/lib/cloudera-scm-alertpublisher /var/log/cloudera-scm-alertpublisher
chown cloudera-scm:cloudera-scm /var/lib/cloudera-scm-alertpublisher /var/log/cloudera-scm-alertpublisher
chmod 775 /var/lib/cloudera-scm-alertpublisher /var/log/cloudera-scm-alertpublisher
mkdir -vp /var/lib/cloudera-scm-headlamp /var/log/cloudera-scm-headlamp
chown cloudera-scm:cloudera-scm /var/lib/cloudera-scm-headlamp /var/log/cloudera-scm-headlamp
chmod 775 /var/lib/cloudera-scm-headlamp /var/log/cloudera-scm-headlamp
mkdir -vp /var/lib/cloudera-service-monitor /var/log/cloudera-service-monitor
chown cloudera-scm:cloudera-scm /var/lib/cloudera-service-monitor /var/log/cloudera-service-monitor
chmod 775 /var/lib/cloudera-service-monitor /var/log/cloudera-service-monitor
mkdir -vp /var/lib/cloudera-scm-navigator /var/log/cloudera-scm-navigator
chown cloudera-scm:cloudera-scm /var/lib/cloudera-scm-navigator /var/log/cloudera-scm-navigator
chmod 775 /var/lib/cloudera-scm-navigator /var/log/cloudera-scm-navigator
mkdir -vp /var/lib/hadoop-hdfs /var/log/hadoop-hdfs
chown cloudera-scm:cloudera-scm /var/lib/hadoop-hdfs /var/log/hadoop-hdfs
chmod 775 /var/lib/hadoop-hdfs /var/log/hadoop-hdfs
mkdir -vp /dfs
chown cloudera-scm:cloudera-scm /dfs
chmod 775 /dfs
mkdir -vp /var/run/hdfs-sockets
chown cloudera-scm:cloudera-scm /var/run/hdfs-sockets
chmod 755 /var/run/hdfs-sockets/

#java.io.IOException: the path component: '/var/run/hdfs-sockets' is group-writable, and the group is not root.  Its permissions are 0775, and it is owned by gid 981.  Please fix this or select a different socket path.
#changed to 755
#chmod 755 hdfs-sockets

mkdir -vp /var/lib/solr /var/log/solr
chown cloudera-scm:cloudera-scm /var/lib/solr
chmod 775 /var/lib/solr
mkdir -vp /var/lib/hbase /var/log/hbase
chown cloudera-scm:cloudera-scm /var/lib/hbase /var/log/hbase
chown cloudera-scm:cloudera-scm /var/log/solr
chmod 775 /var/lib/solr /var/log/solr
chmod 775 /var/lib/hbase /var/log/hbase


mkdir -vp /var/log/hbase-solr /var/lib/hbase-solr
chown cloudera-scm:cloudera-scm /var/log/hbase-solr /var/lib/hbase-solr
chmod 775 /var/log/hbase-solr /var/lib/hbase-solr


mkdir -vp /var/lib/hadoop-yarn /var/log/hadoop-yarn /var/log/hadoop-mapreduce
chown cloudera-scm:cloudera-scm /var/lib/hadoop-yarn /var/log/hadoop-yarn /var/log/hadoop-mapreduce
chmod 775 /var/lib/hadoop-yarn /var/log/hadoop-yarn /var/log/hadoop-mapreduce


mkdir /yarn
chown cloudera-scm:cloudera-scm /yarn
chmod 755 /yarn


mkdir -vp /var/log/spark /var/log/hive
chown cloudera-scm:cloudera-scm /var/log/spark /var/log/hive
chmod 775 /var/log/spark /var/log/hive

mkdir -vp /var/lib/hadoop-httpfs /var/lib/oozie /var/lib/sqoop2 /var/lib/solr
chown cloudera-scm:cloudera-scm /var/lib/hadoop-httpfs /var/lib/oozie /var/lib/sqoop2 /var/lib/solr
chmod 775 /var/lib/hadoop-httpfs /var/lib/oozie /var/lib/sqoop2 /var/lib/solr


mkdir -vp /var/log/oozie /var/lib/oozie /var/log/catalogd /var/log/impalad /var/log/impala-minidumps /var/lib/impala /var/log/statestore
chown cloudera-scm:cloudera-scm /var/log/oozie /var/lib/oozie /var/log/catalogd /var/log/impalad /var/log/impala-minidumps /var/lib/impala /var/log/statestore
chmod 775 /var/log/oozie /var/lib/oozie /var/log/catalogd /var/log/impalad /var/log/impala-minidumps /var/lib/impala /var/log/statestore
mkdir /impala
chown cloudera-scm:cloudera-scm /impala
chmod 755 /impala


#Saw warning and did the below:

chmod 755 /var/lib/oozie
chmod 755 /var/log/oozie


mkdir /var/log/hue
chown cloudera-scm:cloudera-scm /var/log/hue
chmod 775 /var/log/hue

====== End ======

07] Once all the above folders are created, download the CDH installer:
wget https://archive.cloudera.com/cm5/installer/latest/cloudera-manager-installer.bin

08] Make the installer executable:
chmod u+x cloudera-manager-installer.bin

09] Execute Installer:
./cloudera-manager-installer.bin

Follow the steps in the GUI. Once all done, you will see a notification stating that localhost:7180 will open in the browser. Press the close button and you will see the browser open with that link - don't panic if you see a page-not-found or connection error. Wait a couple of minutes, then refresh and continue the installation.

Un-Installing CDH:

In case you want to wipe out (uninstall) CDH and start the installation over, run the following to uninstall and remove all the folders, then start again from step 06:

====== Begin ======
/usr/share/cmf/uninstall-cloudera-manager.sh
read -p "Enter to continue..."

service cloudera-scm-server stop
read -p "Enter to continue..."
service cloudera-scm-server-db stop
read -p "Enter to continue..."
yum remove cloudera-manager-server
read -p "Enter to continue..."
yum remove cloudera-manager-server-db-2
read -p "Enter to continue..."
service cloudera-scm-agent next_stop_hard
read -p "Enter to continue..."
service cloudera-scm-agent stop
read -p "Enter to continue..."
yum remove 'cloudera-manager-*'
read -p "Enter to continue..."
sudo yum remove 'cloudera-manager-*' avro-tools crunch flume-ng hadoop-hdfs-fuse hadoop-hdfs-nfs3 hadoop-httpfs hadoop-kms hbase-solr hive-hbase hive-webhcat hue-beeswax hue-hbase hue-impala hue-pig hue-plugins hue-rdbms hue-search hue-spark hue-sqoop hue-zookeeper impala impala-shell kite llama mahout oozie pig pig-udf-datafu search sentry solr-mapreduce spark-core spark-master spark-worker spark-history-server spark-python sqoop sqoop2 whirr hue-common oozie-client solr solr-doc sqoop2-client zookeeper

read -p "Enter to continue..."
yum clean all
read -p "Enter to continue..."
umount cm_processes
read -p "Enter to continue..."
rm -rf /usr/share/cmf /var/lib/cloudera* /var/cache/yum/cloudera* /var/log/cloudera* /var/run/cloudera*
read -p "Enter to continue..."
rm /tmp/.scm_prepare_node.lock
read -p "Enter to continue..."

rm -rf /var/lib/flume-ng /var/lib/hadoop* /var/lib/hue /var/lib/navigator /var/lib/oozie /var/lib/solr /var/lib/sqoop* /var/lib/zookeeper
read -p "Enter to continue..."

rm -rf /var/lib/zookeeper
rm -rf /var/log/zookeeper
rm -rf /var/lib/zookeeper/version-2
rm -rf /cloudera_manager_zookeeper_canary
rm -rf /var/lib/cloudera-scm-eventserver
rm -rf /var/log/cloudera-scm-eventserver
rm -rf /var/lib/cloudera-scm-firehose /var/log/cloudera-scm-firehose
rm -rf /var/lib/cloudera-scm-alertpublisher /var/log/cloudera-scm-alertpublisher
rm -rf /var/lib/cloudera-scm-headlamp /var/log/cloudera-scm-headlamp
rm -rf /var/lib/cloudera-service-monitor /var/log/cloudera-service-monitor
rm -rf /var/lib/cloudera-scm-navigator /var/log/cloudera-scm-navigator
rm -rf /var/lib/hadoop-hdfs /var/log/hadoop-hdfs
rm -rf /dfs
rm -rf /var/run/hdfs-sockets
rm -rf /var/lib/solr /var/log/solr
rm -rf /var/lib/hbase /var/log/hbase
rm -rf /var/log/hbase-solr /var/lib/hbase-solr
rm -rf /var/lib/hadoop-yarn /var/log/hadoop-yarn /var/log/hadoop-mapreduce
rm -rf /yarn
rm -rf /var/log/spark /var/log/hive
rm -rf /var/lib/hadoop-httpfs /var/lib/oozie /var/lib/sqoop2 /var/lib/solr
rm -rf /var/log/oozie /var/lib/oozie /var/log/catalogd /var/log/impalad /var/log/impala-minidumps /var/lib/impala /var/log/statestore
rm -rf /var/run/hdfs-sockets /var/log/hue
rm -rf /impala
rm -rf /usr/lib64/cmf

====== End ======

Saturday, September 3, 2016

Disable SELinux

Edit the file /etc/selinux/config and change the value of SELINUX to disabled, then reboot.

Verify the state with the 'getenforce' command.
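For reference, the relevant line in /etc/selinux/config after the edit (the rest of the file stays as shipped):

SELINUX=disabled

After the reboot, 'getenforce' should print Disabled.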

Monday, February 29, 2016

ORA-27492: unable to run job : scheduler unavailable

Check the JOB_QUEUE_PROCESSES parameter value.

The user was able to execute the job manually by selecting the "Use Current Session" option within TOAD, but got the error ORA-27492 when choosing the "Do not use current session" option.

ALTER SYSTEM SET job_queue_processes = 10; (size this based on the number of jobs you will be executing simultaneously)
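To check the current value first (a standard dictionary query, not from the original post):

SQL> show parameter job_queue_processes
SQL> select name, value from v$parameter where name = 'job_queue_processes';

If it comes back as 0, the scheduler cannot spawn job slaves, which matches the ORA-27492 below.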

ORA-27492: unable to run job "JOB_NAME": scheduler unavailable
ORA-06512: at "SYS.DBMS_ISCHED", line 185
ORA-06512: at "SYS.DBMS_SCHEDULER", line 486
ORA-06512: at line 2