@osamab has already explained Informatica, and I don't have any experience with that tool.
Coming to Pentaho: it is suitable for ETL activities on Hadoop, but I don't know at what scale people have been successful with it. We need to remember that Informatica and Pentaho are enterprise tools and are not open by design; they are built by their vendors with the company's own and its clients' overall requirements in mind. Open source systems (StreamSets, NiFi, HDF), on the other hand, are publicly available and easy to customize or adapt to our needs, whereas enterprise systems always come with some constraints. Most companies are showing interest in open source so that software costs go down and can be controlled more easily.
I know SAP very well, as I'm working on a few advanced SAP systems, and all of this CDC/delta management is built in there. But we need to remember one thing: these systems are not suited to a distributed system architecture. So whether these enterprise ETL tools can cater to Hadoop-scale data ingestion is still an open question.
That's why people are always interested in using open source tools: you will never run out of options.