SequenceOfAddresses doesn't seem to be populated in any of my documents
I'm currently looking back through my UI code for the DE academy, and when I check the document source I find that none of my documents has a populated sequenceOfAddresses field. They're all just empty arrays.
So I'm guessing something has gone wrong along the way but I'm not really sure where to look. If anyone can direct me that would be great, cheers!
Best Answers
-
Hey Ben, did you check that the output from your CCC step matches the expected counts?
Link to file with expected output numbers: Commands to check your CCC output
E.g.:
Load the created parquet into the Spark shell and check the number of offshore addresses in your output using the command:
`finalOutputDS.select(explode($"addresses")).count`
// Count should be 5310
-
Just to add to this: we have around 215k documents and only about 5k have non-third-party addresses, which is ~2.5% of our documents, so that's why it's hard to find an example with an address.
From memory, I think Dunkeld should have an offshore address. If you want to confirm, I would recommend using the Spark shell and the CDDM to find an offshore address, then searching for that in the UI.
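For anyone wanting to try that search, here is a minimal sketch of one way to do it. It assumes the `addresses` column used in the count command is an array column; the `ToyAddress` and `ToyDocument` case classes and their values are hypothetical stand-ins so that the snippet runs without the Academy data. In the Spark shell you would skip the toy data and use your own loaded `finalOutputDS`.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.size

// Hypothetical toy schema, only here so the sketch runs without the Academy data.
case class ToyAddress(addressLine: String)
case class ToyDocument(documentId: String, addresses: Seq[ToyAddress])

// In the Spark shell this simply returns the session that already exists.
val spark = SparkSession.builder().master("local[*]").appName("address-search-sketch").getOrCreate()
import spark.implicits._

// Stand-in for the CCC parquet output loaded in the Spark shell.
val finalOutputDS = Seq(
  ToyDocument("doc-1", Seq(ToyAddress("1 Example Street"))),
  ToyDocument("doc-2", Seq.empty)
).toDS()

// Keep only documents whose addresses array has at least one entry.
val documentsWithAddresses = finalOutputDS.filter(size($"addresses") > 0)

// Print a few candidates in full so one can be searched for in the UI.
documentsWithAddresses.show(5, truncate = false)
```

Any row printed this way belongs to the small share of documents that should show a non-empty address array in the UI.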
Comments
-
Cheers for your reply Stuart. Yeah, I do get the expected count, so I guess that means they are in fact being populated. It's just strange, as I can't seem to find a document with an associated address, but going by the count they do seem pretty rare, so maybe that is why. Thanks again!
-
The count of addresses nested under third parties is larger, mind.
`finalOutputDS.select(explode($"sequenceOfThirdParties")).select(explode($"col.sequenceOfAddresses")).count`
// Count should be 209965
If you replace `.count` in the above with `.show`, it might assist your investigation.
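A rough sketch of that `.show` variant, in case it helps. The column names `sequenceOfThirdParties` and `sequenceOfAddresses` come from the command above; the `ToyNestedAddress`, `ToyThirdParty` and `ToyNestedDocument` case classes and their data are hypothetical, included only so the snippet runs without the Academy data. The aliases are an optional substitute for the default `col` name that `explode` produces.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.explode

// Hypothetical toy schema mirroring only the two column names from the thread.
case class ToyNestedAddress(addressLine: String)
case class ToyThirdParty(name: String, sequenceOfAddresses: Seq[ToyNestedAddress])
case class ToyNestedDocument(documentId: String, sequenceOfThirdParties: Seq[ToyThirdParty])

// In the Spark shell this simply returns the session that already exists.
val spark = SparkSession.builder().master("local[*]").appName("nested-address-sketch").getOrCreate()
import spark.implicits._

// Stand-in for the CCC parquet output loaded in the Spark shell.
val finalOutputDS = Seq(
  ToyNestedDocument("doc-1", Seq(
    ToyThirdParty("party-a", Seq(ToyNestedAddress("1 Example Street"))),
    ToyThirdParty("party-b", Seq(ToyNestedAddress("2 Example Road"), ToyNestedAddress("3 Example Lane")))
  )),
  ToyNestedDocument("doc-2", Seq.empty)
).toDS()

// One row per third party, then one row per address held by that third party.
val thirdPartyAddresses = finalOutputDS
  .select(explode($"sequenceOfThirdParties").as("thirdParty"))
  .select(explode($"thirdParty.sequenceOfAddresses").as("address"))

// Print a handful of rows in full instead of only counting them.
thirdPartyAddresses.show(5, truncate = false)
```

Passing `truncate = false` stops Spark from cutting long address values short in the printed table.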