SequenceOfAddresses doesn't seem to be populated in any of my documents

Ben_Ryan Posts: 2 QUANTEXA TEAM

I'm currently looking back through my UI code for the DE academy. When I check the document source, none of my documents have a populated sequenceOfAddresses field; they're all just empty arrays.

So I'm guessing something has gone wrong along the way but I'm not really sure where to look. If anyone can direct me that would be great, cheers!

Best Answers

  • Stuart_Johnson
    Stuart_Johnson Posts: 5 QUANTEXA TEAM
    Answer Accepted ✓

    Hey Ben, did you check that the output from your CCC step matched the expected counts?

    Link to file with expected output numbers: Commands to check your CCC output

    E.g.:

    Load the created parquet into the Spark shell and check the number of offshore addresses in your output using the command:
    `finalOutputDS.select(explode($"addresses")).count`
    // Count should be 5310

  • Sian_Ayres
    Sian_Ayres Posts: 513 QUANTEXA TEAM
    Answer Accepted ✓

    Just to add to this, we have around 215k documents and only about 5k have non-third-party addresses, which is ~2.5% of our documents, so that's why it's hard to find an example with an address.

    From memory, I think Dunkeld should have an offshore address. If you want to confirm, I would recommend using the Spark shell and the CDDM to find an offshore address, then searching for it in the UI.
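
The suggestion above, finding a document that actually has an address before searching for it in the UI, could be sketched roughly as follows. This is only an illustration on toy data: the `documentId` column, the string-typed `addresses` array and the local SparkSession are assumptions made so the example is self-contained, and the real academy output will have a richer schema. In the Spark shell you would apply the same `filter` to `finalOutputDS` instead of the toy DataFrame.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, explode, size}

object FindDocumentsWithAddresses {
  // Total number of address entries across all documents
  // (same idea as the explode(...).count check in the answer above).
  def countAddresses(docs: DataFrame): Long =
    docs.select(explode(col("addresses"))).count()

  // Keep only the documents whose addresses array is non-empty.
  def withAddresses(docs: DataFrame): DataFrame =
    docs.filter(size(col("addresses")) > 0)

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("find-documents-with-addresses")
      .getOrCreate()
    import spark.implicits._

    // Toy stand-in for finalOutputDS; column names and values are hypothetical.
    val docs = Seq(
      ("doc-1", Seq.empty[String]),
      ("doc-2", Seq("1 Example Street")),
      ("doc-3", Seq.empty[String]),
      ("doc-4", Seq("2 Sample Road", "3 Sample Road"))
    ).toDF("documentId", "addresses")

    println(s"address entries: ${countAddresses(docs)}")
    // Print the few documents that do have an address, untruncated.
    withAddresses(docs).show(false)

    spark.stop()
  }
}
```

Any identifier shown in the filtered output can then be used as the search term in the UI.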

Comments

  • Ben_Ryan
    Ben_Ryan Posts: 2 QUANTEXA TEAM

    Cheers for your reply Stuart. Yeah, I do get the expected count, so I guess that means they are in fact being populated. It's just strange, as I can't seem to find a document with an associated address, but going by the count they do seem pretty rare, so maybe that is why. Thanks again!

  • Stuart_Johnson
    Stuart_Johnson Posts: 5 QUANTEXA TEAM

    There are far more addresses nested under third parties, mind.

    finalOutputDS.select(explode($"sequenceOfThirdParties")).select(explode($"col.sequenceOfAddresses")).count
    // Count should be 209965

    If you replaced the .count above with a .show, it might assist your investigation.
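
As a rough illustration of swapping the count for a show on the nested third-party addresses, the sketch below carries a document identifier through both explodes so each printed address can be traced back to its document. The case classes, the `documentId` and `name` fields, the string-typed addresses and the local SparkSession are all assumptions for the sake of a self-contained example; only `sequenceOfThirdParties` and `sequenceOfAddresses` come from the thread.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, explode}

object ShowNestedAddresses {
  // Hypothetical toy schema; the real documents hold richer address structs.
  case class ThirdParty(name: String, sequenceOfAddresses: Seq[String])
  case class Document(documentId: String, sequenceOfThirdParties: Seq[ThirdParty])

  // One row per (document, nested third-party address).
  def nestedAddresses(docs: DataFrame): DataFrame =
    docs
      .select(col("documentId"), explode(col("sequenceOfThirdParties")).as("thirdParty"))
      .select(col("documentId"), explode(col("thirdParty.sequenceOfAddresses")).as("address"))

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("show-nested-addresses")
      .getOrCreate()
    import spark.implicits._

    // Toy stand-in for finalOutputDS.
    val docs = Seq(
      Document("doc-1", Seq(ThirdParty("tp-a", Seq("1 Example Street", "2 Example Street")))),
      Document("doc-2", Seq.empty),
      Document("doc-3", Seq(ThirdParty("tp-b", Seq.empty), ThirdParty("tp-c", Seq("3 Sample Road"))))
    ).toDF()

    val addresses = nestedAddresses(docs)
    println(s"nested address entries: ${addresses.count()}")
    // Show the rows themselves rather than just the count.
    addresses.show(false)

    spark.stop()
  }
}
```

Documents and third parties with empty arrays simply drop out, because `explode` emits no rows for an empty array.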
