New Idea

We should have a CSV file validation tool or API to validate data before it gets loaded

phanindra_sambaraju
I think we should have CSV file validation before data is loaded through S3, the data load API, or COM. We are seeing a lot of issues, and they are tough to debug once the data is loaded or when the file is very large. For instance, consider a CSV file whose character encoding is ANSI while the S3 job configuration specifies UTF-8. In this scenario no specific error is thrown; the data load succeeds a few times and fails most of the time. If we had CSV validation beforehand, we could avoid this and show a proper error message.
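To illustrate why the encoding case is intermittent, here is a minimal Python sketch. It assumes "ANSI" means Windows-1252 (cp1252); the function name `check_encoding` and the sample data are made up for illustration and are not part of any existing product API. Rows that contain only ASCII characters decode fine as UTF-8, so only rows with accented or special characters fail.

```python
def check_encoding(raw, declared="utf-8"):
    """Return one message per line of `raw` (bytes) that does not decode as `declared`.

    Splitting on the newline byte is safe for ASCII-compatible encodings such as
    UTF-8 and cp1252; it would not be safe for UTF-16.
    """
    errors = []
    for line_no, line in enumerate(raw.split(b"\n"), start=1):
        try:
            line.decode(declared)
        except UnicodeDecodeError as exc:
            errors.append(f"line {line_no}: not valid {declared} ({exc.reason})")
    return errors


# Hypothetical sample: saved as cp1252 ("ANSI") but the job is configured as UTF-8.
sample = "id,name\n1,Alice\n2,José\n".encode("cp1252")
encoding_errors = check_encoding(sample)
for message in encoding_errors:
    print(message)
```

Only the row containing "José" is reported here; a file that happens to hold plain ASCII data would pass, which matches the "succeeds a few times" behaviour described above.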

Below are the use cases for which we need validation:
1) Data type mismatch
2) Field name mismatch
3) Char encoding mismatch
4) Blank rows
5) If possible, we need to check for single double quotes between lines
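
As a rough sketch of what such a pre-load check could look like for use cases 1, 2, 4 and 5, here is a small Python example built on the standard `csv` module. The function `validate_csv`, the `schema` format (column name mapped to a converter such as `int`), and the sample data are all assumptions for illustration, not an existing API. The quote check is a simple heuristic: in well-formed CSV, quotes come in pairs, so an odd total count means some quoted field is never closed.

```python
import csv
import io


def validate_csv(text, schema):
    """Return a list of error strings; an empty list means nothing was flagged.

    `schema` maps each expected column name to a converter such as int or float.
    """
    errors = []

    # Use case 4: blank rows. Limitation: an empty line inside a quoted
    # multi-line field is also reported.
    for line_no, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            errors.append(f"line {line_no}: blank row")

    # Use case 5: an odd number of double quotes means a field is never closed.
    if text.count('"') % 2 == 1:
        errors.append("unbalanced double quote somewhere in the file")
        return errors  # parsing further would give misleading row errors

    # Use case 2: field name mismatch against the expected header.
    reader = csv.DictReader(io.StringIO(text))
    header = reader.fieldnames or []
    missing = sorted(set(schema) - set(header))
    unexpected = sorted(set(header) - set(schema))
    if missing or unexpected:
        errors.append(f"header mismatch: missing={missing}, unexpected={unexpected}")
        return errors

    # Use case 1: data type mismatch, reported per row and column.
    for row in reader:
        for name, convert in schema.items():
            value = row[name]
            try:
                convert(value)
            except (TypeError, ValueError):
                errors.append(
                    f"line {reader.line_num}: {name}={value!r} "
                    f"is not a valid {convert.__name__}"
                )
    return errors


# Hypothetical sample with one blank row and one bad value.
schema = {"id": int, "amount": float}
problems = validate_csv("id,amount\n1,10.5\n\n2,abc\n", schema)
for problem in problems:
    print(problem)
```

Returning line-level messages like these, instead of a generic failure after the load, is the kind of output that would make large files much easier to debug.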

Please let me know if you need any further information.

Thanks,
Phanindra

2 replies

ashok_dugaputi
  • November 1, 2016
Phanindra - I agree on the charset level. Are you saying that there is NO validation for Data type & Field name mismatch?

phanindra_sambaraju
For a data type mismatch, yes, we validate and return the failed records with an error message saying that we are unable to parse the data. For a field name mismatch, we throw "Unknown exception".
