Day 1 – Big Data Project

1. Users had issues running Impala queries, hitting out-of-memory errors. Multiple users were running Impala queries at the same time. Each node had 97 GB of physical RAM; one user occupied 75 GB, so the other user suffered. The Impala logs identified missing statistics. Advised the user to gather statistics so that a more efficient plan can be found.
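As a rough illustration of item 1, the sketch below shares a node's memory among concurrent users and builds the Impala statements one might then run. MEM_LIMIT is an Impala query option and COMPUTE STATS is an Impala statement; the table name sales.orders, the 20% headroom and the helper names are assumptions made for this example, not details from the project.

```python
from typing import List

# Hypothetical helpers: the names, the headroom and the table are assumptions.
NODE_RAM_GB = 97  # physical RAM per node, as stated in the note above


def per_query_mem_limit_gb(node_ram_gb: int, concurrent_users: int,
                           headroom: float = 0.2) -> int:
    """Evenly share the RAM left after reserving headroom for other services."""
    if concurrent_users < 1:
        raise ValueError("concurrent_users must be at least 1")
    usable = node_ram_gb * (1.0 - headroom)
    return int(usable // concurrent_users)


def impala_statements(table: str, mem_limit_gb: int) -> List[str]:
    """Build a per-session memory cap and a statistics-gathering statement."""
    return [
        f"SET MEM_LIMIT={mem_limit_gb}g;",
        f"COMPUTE STATS {table};",
    ]


if __name__ == "__main__":
    limit = per_query_mem_limit_gb(NODE_RAM_GB, concurrent_users=2)
    for stmt in impala_statements("sales.orders", limit):
        print(stmt)
```

With a cap like this in place, a single heavy query would fail on its own limit instead of starving the other user, and fresh statistics give the planner a chance to choose a cheaper join order.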

2. An issue with the source system is resolved: a view that was very slow, taking more than 20 minutes, now finishes in seconds. The cause was missing statistics on an Oracle table.

3. Cloudera Data Science Workbench discussions concentrated on: resource requirements in the virtual machines; local SSD requirements from the users; the dependency between master and worker VMs; a repository supporting yum, Maven, parcels and Docker images; agile delivery of VMs through a pilot phase in production, as this will also enable us to observe and learn from the behaviour of the users and the technical components; backup and restore; the number of users per VM; and allowing users to download binaries from the internet, but perhaps through Nexus Pro for its ability to automatically flag defective binaries.

4. Users accessing the Hadoop environment have issues if the AD group has spaces in its name. The short-term solution is to request a new Active Directory group with underscores replacing the spaces. The long-term option is to explore a JNI-based solution found on the net.
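The short-term renaming rule in item 4 can be sketched as a small helper that proposes the new group name. The function name and the sample group are hypothetical; the actual naming convention is whatever the Active Directory team agrees to.

```python
import re


def underscore_group_name(group: str) -> str:
    """Collapse each run of whitespace in an AD group name into one underscore."""
    return re.sub(r"\s+", "_", group.strip())


if __name__ == "__main__":
    print(underscore_group_name("Big Data Users"))  # Big_Data_Users
```

Collapsing runs of whitespace, rather than replacing each space individually, avoids names with doubled underscores when the original group name contains accidental double spaces.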

5. Saw users interested in writing analytical queries using the LAG function in Impala.
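For readers new to LAG in item 5: it returns the value from an earlier row in the ordered window, which makes row-over-row comparisons easy. The sketch below mimics that behaviour in plain Python. The table daily_sales, its columns and the sample amounts are assumptions for illustration only.

```python
from typing import List, Optional, Sequence, TypeVar

T = TypeVar("T")


def lag(values: Sequence[T], offset: int = 1,
        default: Optional[T] = None) -> List[Optional[T]]:
    """Return, for each position, the value `offset` rows earlier (or default)."""
    if offset < 0:
        raise ValueError("offset must be non-negative")
    return [values[i - offset] if i >= offset else default
            for i in range(len(values))]


# Roughly what this query would compute on a hypothetical table daily_sales:
#   SELECT day, amount,
#          amount - LAG(amount, 1) OVER (ORDER BY day) AS change
#   FROM daily_sales;
amounts = [100, 120, 90]  # assumed to be already ordered by day
previous = lag(amounts)
changes = [None if p is None else a - p for a, p in zip(amounts, previous)]
print(changes)  # [None, 20, -30]
```

The first row has no predecessor, so it yields the default (NULL in SQL, None here) unless a third argument to LAG supplies another value.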

6. Saw users crashing Hue by trying to download huge files from Hue. Raised an SR with Cloudera, which is an iterative process.

7. Coordination and team meetings are good for team building and for ensuring everyone is on the same page. This works out well when everyone wants to contribute.

8. Process-oriented matters handled through emails and ad hoc telephone calls have the potential to disrupt focus on other urgent things.

9. Ad hoc emails asking technical questions, such as whether both the Spark SQL and Spark Core modules are supported in CDH 5.10, take up some investigation time if this information has not already been gathered.

Author: admin