Experience with Oracle Big Data Appliance

1. We have around 26 Big Data Appliance nodes in production. We use Cloudera CDH 5.15.2 as our Hadoop distribution.

2. When we recently added new nodes to the production cluster, the CPUs in the new nodes were not compatible with CDH 5.13, which we were running at the time. So we were not able to scale Impala onto these new nodes, and as a result we could not rebalance the data across all the nodes. YARN jobs ran fine. After upgrading to 5.15.2, the issue with the new CPUs was gone.

3. We used Informatica BDM (Big Data Management) for our ETL loads and Informatica EDC (Enterprise Data Catalog) for our metadata management. Due to numerous incompatibility issues and dependencies between different versions of CDH and these Informatica products, we migrated to Spark workflows and did not extend the EDC license.



4. You will need a dedicated team for security patching, hardware support, and CDH upgrades on the Big Data Appliances. The team should already have experience with BDAs. Moreover, the CDH upgrade on a BDA is handled through scripts provided by Oracle, which is not straightforward on a secure cluster. To handle the issues you encounter during the upgrade, you need experience with the Unix operating system, Unix admin skills, know-how of the different Hadoop services, and strong Kerberos knowledge. We were lucky to have strong internal resources and a good, experienced resource from Oracle, who were able to handle all the issues and made sure the upgrades went smoothly.

5. Testing all our ETL flows and communicating with our business counterparts was an essential activity that consumed time before and during the upgrades. If the testing is handled by a managed services provider, they will have to do zero-based testing before the upgrade in order to prove that the upgrade did not break existing ETL flows and other functionality. One can imagine the time this requires across the development, test, acceptance (pre-production), and production environments.
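
Part of the testing described above can be automated by recording a baseline of each ETL output before the upgrade and comparing it afterwards. The following is only a minimal sketch, assuming each output table can be read as an iterable of row tuples; how the rows are fetched (Hive, Impala, Spark) is left out, and the names `table_fingerprint` and `compare_snapshots` are hypothetical helpers, not part of any Cloudera, Oracle, or Informatica tooling.

```python
import hashlib

MASK_64 = (1 << 64) - 1


def table_fingerprint(rows):
    """Return (row_count, checksum) for an iterable of row tuples.

    The checksum is a sum of per-row hashes modulo 2**64, so it does not
    depend on row order but still changes when a row is duplicated.
    """
    count = 0
    checksum = 0
    for row in rows:
        count += 1
        encoded = repr(tuple(row)).encode("utf-8")
        row_hash = int.from_bytes(hashlib.sha256(encoded).digest()[:8], "big")
        checksum = (checksum + row_hash) & MASK_64
    return count, checksum


def compare_snapshots(before, after):
    """Compare two {table_name: fingerprint} dicts; return (table, reason) pairs."""
    problems = []
    for table in sorted(set(before) | set(after)):
        if table not in after:
            problems.append((table, "missing after upgrade"))
        elif table not in before:
            problems.append((table, "no baseline"))
        elif before[table][0] != after[table][0]:
            problems.append((table, "row count changed"))
        elif before[table] != after[table]:
            problems.append((table, "content changed"))
    return problems


# Illustrative in-memory data standing in for real ETL outputs (assumption).
baseline = {
    "orders": table_fingerprint([(1, "a"), (2, "b")]),
    "customers": table_fingerprint([(10, "x")]),
}
current = {
    "orders": table_fingerprint([(2, "b"), (1, "a")]),  # same rows, new order
    "customers": table_fingerprint([(10, "y")]),        # one value changed
}
print(compare_snapshots(baseline, current))
```

A fingerprint per table keeps the stored baseline small, which matters when the same comparison has to be repeated in every environment. It only tells you that a table changed, not which rows, so a failing table still needs a manual or row-level follow-up.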

Author: admin