FAQ: Elasticsearch "cluster health: Not connected" in Chrome / "Connection refused" script error
FAQ relevant for: all Academy versions

Sometimes on the VDIs your Elasticsearch service will become disconnected. When this happens the data isn't available for viewing, and you will see errors in your UI. If you try to run a script that needs Elastic (e.g. a load Elastic ETL script) while it isn't connected, you will get an error such as:

    Exception in thread "main" java.net.ConnectException: Connection refused

To reconnect the Elasticsearch service, run the following command from any terminal window on your VDI:

    sudo systemctl restart elasticsearch.service

FAQ: I can't find Addresses or Individuals when I search in my UI!
FAQ relevant for: all Academy versions

If you perform a search in your UI and it only returns e.g. Business entities but not Individuals or Addresses, there are a few things you can check to resolve this issue.

Firstly, remember that when you perform a general search (red box below), it will search within any field sets (orange box) you have configured in the search definitions of the resolver config. If you have configured these search definitions for all entity types, then also double check the resolver config for spelling errors in the fields that you are pointing at in the underlying data - these searches run against fields in the cleansed Document Data Model.

The other place to check is that your Elastic has the data you expect in it - do you have address and individual data there? If not, then you may want to review your ETL pipeline, and check out this related FAQ.

Lastly, you may need to refresh the security permissions to be able to access these additional Entities. You can do this by running the following in your home directory on the VDI (while the UI isn't running):

    ./drop_recreate_databases.sh

After this, you should see the additional search fields and be able to find these entities.

Let us know in a new post if you are still having issues after performing these checks!

FAQ: I'm trying to run a script and seeing an error about "Unrecognized option: -s"
FAQ relevant for: all Academy versions

If you are trying to run a Scala script (e.g. ImportRawToParquet) and you see the above-mentioned error, the main cause is that the program can't find the relevant JAR file to run. You may also get an error saying something like " .jar not found" or "jar does not exist, skipping."

The solution is to build the relevant JAR file(s) specified in the Spark Shell script that you tried to run the Scala file with. For example, the "runQSS.sh" script of the academy task project requires two JARs: the data-source-all Project and Dependency shadow JARs. Building these two JARs, and verifying that they are in the correct location with names that match the full file paths in the relevant Spark Shell script, should fix the issue.

Let us know in a new post if the above solution didn't fix the issue for you!

FAQ: I'm missing data in Elasticsearch / my number of docs are wrong
FAQ relevant for: all Academy versions

If you have completed the ETL pipeline stages of your project and uploaded the data to Elasticsearch, then when checking your indices in the Elasticsearch Head plugin on Chrome you should have numbers similar to the picture below (to get a bigger version of the image, right click it and choose the option to open it in a new tab).

Note: If your numbers vary a little from these, for example 152k addresses instead of 148k, that's ok - the numbers will change a little for the resolver indices (Individual/Address/Business) based on the compound keys you have imported in the respective *.qentity Fusion config files.

If your numbers are significantly different, go back through your ETL pipeline and carefully check each stage to see where you lose the data along the way. A good way to approach this is to work forwards from CreateCaseClass and check the output of each stage to find the problem area. You should also use the counts in Elasticsearch to guide you - for example, if you have only half the number of businesses listed above and no individuals, you probably haven't joined your Third Parties onto the ICIJ document properly, so you would want to go back and double check how you have done this join and on what fields.

Specific points to consider:

- Have I correctly parsed all of the necessary fields in my qmodel files?
- Have I used the correct type of joins in CreateCaseClass, and have I joined on the correct fields?
- Have I outputted the correct Dataset at the end of CreateCaseClass?
- Have I loaded up the DocumentDataModel.parquet (the output of CreateCaseClass) into a Spark-Shell to check the output there?
- Have I correctly identified and defined all relevant start paths in my qentity files?
- Do I have a good range of compound keys for each Entity?
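As a complement to the Elasticsearch Head plugin, the document counts can also be read from a terminal on the VDI. The sketch below is a minimal example, assuming Elasticsearch is listening on http://localhost:9200 (the address used by the curl command in this FAQ) and using the standard _cat/indices API. The function names list_doc_counts and flag_empty_indices are made up for illustration and are not part of any Quantexa tooling.

```shell
#!/usr/bin/env bash
# Assumption: Elasticsearch is reachable on localhost:9200, as in the
# curl command used elsewhere in this FAQ. Override ES_URL if it is not.
ES_URL="${ES_URL:-http://localhost:9200}"

# Print one row per index (index name, document count), with a header row.
list_doc_counts() {
  curl -s "${ES_URL}/_cat/indices?v&h=index,docs.count"
}

# Read that table on stdin and print the names of indices holding zero docs.
# Hypothetical helper for illustration only.
flag_empty_indices() {
  awk 'NR > 1 && $2 == "0" { print $1 }'
}
```

Running list_doc_counts prints the table, and piping it through flag_empty_indices narrows it to indices that received no documents, which can hint at the pipeline stage worth revisiting first.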
If you are convinced that you have done all of the above correctly, you can try clearing the data from Elasticsearch, restarting the service and then re-uploading the data to Elastic using the following three commands:

    curl -X DELETE 'http://localhost:9200/_all'
    sudo systemctl restart elasticsearch.service
    ./runQSS.sh -s com.quantexa.academy.task.icij.model.etl.IcijLoadElasticScript -c ../external.conf -r elastic.icij

FAQ: Broken Elasticsearch Head Chrome Extension Workaround
FAQ relevant for: all Academy versions

Hi all,

Support for Manifest V2 has officially been disabled in Google Chrome, which means the Elasticsearch Head Chrome extension is no longer functioning properly on the VDIs. When you try to add the Elasticsearch Head Chrome extension you may see the following error:

"This extension was turned off because it is no longer supported"

or this error:

"Cannot install extension because it uses an unsupported manifest version. Could not load manifest."

We are looking into ways we can resolve this issue, however for now we have a workaround to get the extension working.

Error Workaround

Here are the steps you can follow to work around this error:

Step 1: Navigate to chrome://flags/#extension-manifest-v2-deprecation-warning in the Chrome browser.

Step 2: Disable the following settings:
- Extension Manifest V2 Deprecation Warning Stage
- Extension Manifest V2 Deprecation Disabled Stage
- Extension Manifest V2 Deprecation Unsupported Stage

Step 3: Restart Chrome.

Step 4: Re-add the Elasticsearch Head Chrome extension. This comment details how to add the extension to Chrome.

You should now be able to use the Elasticsearch Head Chrome extension as normal!

Apologies for the inconvenience, I hope this helps!

FAQ: Gradle taking too long to index
FAQ relevant for: all Academy versions

When opening IntelliJ on the Academy VDIs, dependencies may take too long to load, causing the project to not index properly. This happens because IntelliJ attempts to download dependencies but fails, as the VDIs do not have access to the internet. We can stop this happening by putting Gradle in offline mode:

First, click on the Gradle panel near the top of the right sidebar.

Then, click on the toggle offline mode button. The button will be a lighter grey colour when offline mode is on.

Finally, click the refresh button to restart the indexing.

Your project should now index a lot quicker (although it may still take 10-20 mins)!

FAQ: Academy Documentation Links
Hey all! Here are some useful links for the Academies:

ETL: Configuring .qmodel files; Configuring .qentity files; Entities, Start Paths, and traversals; Defining Traversals; Defining Compounds; Defining Elements

Quantexa Core Library: Core Traversals; Core Elements; Core Templates (Address, Business, Individual); Core Compounds (Address, Business, Individual + Individual Synonyms); Core Parsers (Address; Business; If Business parse Business, If not Business parse Individual; Date to Date parts)

Entity Resolution: Resolver Config; Resolution Templates; Filtered Compounds; Compound Exclusions

Network Generation: Expansion Steps

Scoring (v2.1.8): Severity Tooling; Assigning Severity to a Score; Configuring Score Descriptions; Score Descriptions; Score Description Rendering; Extracting Values from Configuration files; Scorecard Configuration; For Comprehensions (Scala Documentation); Project Example - Scoring Best Practices

UI: Search Configuration; Expansion Templates

Traversal DSL Functions: home page; Example

Note: The Data Engineering Academy is currently on v2.0.1 of the Quantexa Platform and the Scoring Engineer Academy is currently on v2.1.1, however the closest documentation versions we have are for v2.0.8 and v2.1.8 respectively. These versions should be sufficient for you to complete the Academy!

Please comment below if you find any other useful links from the Quantexa Documentation Site so we can add them to the list!

Setting up the training-tutorial project for Technical BA Academy
FAQ relevant for: Technical BA Academy

Hi Guys!

Some good news for all of you doing the Technical Business Analyst Academy! We've added a new script to your VDIs that simplifies the training-tutorial project setup by combining all the necessary setup commands into a single script.

If you've already set up the training-tutorial project, there's no need to do it again! You can also still set up the project manually by following BA Module 1.2: Smoke Test (after completing Module 1.1: Development Tools). The new script is just an additional method for setting up the training-tutorial project for the Scenario-based Tasks.

The rest of this post explains how to get the new script on your VDI, and how to use it to set up the training-tutorial project.

How to get the updated setup scripts

To add the new script to your VDI, you will need to unzip the setup scripts again. To do this, run the following commands in order:

    cd ~
    cp /opt/training/other-resources/setup-scripts/tba-academy/setup_scripts.zip ~
    unzip setup_scripts.zip
    cd ~/setup-scripts-analyst/
    chmod +x *.sh

If you are prompted with a message asking whether to replace existing files, enter A (or y) to overwrite the existing setup scripts folder with the new folder.

Things to Check:

Make sure you have the correct setup scripts:
- Open the /setup-scripts-analyst folder.
- Check that the file run_training_tutorial_setup.sh is there.
- If the file is missing, rerun the commands above.

Make sure you have the training-tutorial project:
- Check you have the /training-tutorial folder in the home directory of your VDI.
- If you do not have the training-tutorial project, run the following commands:

    cd ~/setup-scripts-analyst
    ./run_setup.sh

Running the training-tutorial setup script

Now that you have the new setup script, you can use it to set up the training-tutorial project!
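Before starting the long-running script, the two items under "Things to Check" can be sketched as a small shell function. This is only an illustrative helper, not one of the provided setup scripts: the name check_setup is made up, and it assumes the folders live in your home directory as described in this post.

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight check; the paths are the ones named in this post.
# Takes an optional base directory (defaults to the home directory).
check_setup() {
  local home_dir="${1:-$HOME}"
  local status=0

  # The new setup script should exist after unzipping setup_scripts.zip.
  if [ ! -f "${home_dir}/setup-scripts-analyst/run_training_tutorial_setup.sh" ]; then
    echo "missing: run_training_tutorial_setup.sh (unzip the setup scripts again)"
    status=1
  fi

  # The training-tutorial project folder is created by run_setup.sh.
  if [ ! -d "${home_dir}/training-tutorial" ]; then
    echo "missing: training-tutorial folder (run ./run_setup.sh)"
    status=1
  fi

  return "$status"
}
```

Calling check_setup with no arguments checks your home directory. It prints nothing and returns 0 when both items are present, and otherwise names what is missing.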
To run the training-tutorial setup script, run the following commands in order:

Step 1: Navigate to the /setup-scripts-analyst folder using:

    cd ~/setup-scripts-analyst

Step 2: Run the training-tutorial setup script using:

    ./run_training_tutorial_setup.sh

This script may take up to an hour to fully execute all the commands.

Important Note: Do not leave the VDI idle for too long, as it will disconnect from inactivity! If this happens, you will need to rerun the script.

If all the commands execute successfully, you will see a final message in the terminal. You should then be able to run the Quantexa UI on the training-tutorial project using the run-all script!

If you encounter any issues while running this script or have any questions, please let us know in the comments of this post!

We hope this helps!

Many thanks,
The Academy Team

FAQ: How can I handle dates when writing a Quantexa Score in Scala?
Handling Temporal Data

Temporal data refers to any data that is associated with a specific point or period in time. This includes dates, times, intervals and timestamps. This information is useful for indicating when an event has occurred.

LocalDate

One common way of handling temporal data such as dates in your score logic is the LocalDate class, which is part of the java.time package. LocalDate allows you to manipulate dates, calculate differences between two dates, format dates into strings and parse strings into LocalDate objects.

In Scala you can also use the java.time.format.DateTimeFormatter class to format and parse temporal data using specific patterns. Here is an example:

    import java.time.LocalDate
    import java.time.format.DateTimeFormatter

    // Formatter for dates written like "2024-04-17"
    val dateTimeFormatter = DateTimeFormatter.ofPattern("yyyy-MM-dd")

The above line defines the pattern used to parse your dates.

    // yourDateString is a placeholder for a date held as a String
    val yourDate = LocalDate.parse(yourDateString, dateTimeFormatter)

The above line parses the string into a LocalDate object, which would then look like this: 2024-04-17

Now that we have our LocalDate, we can perform operations which can be useful when writing your score logic.

Commonly Used Operations

Comparing dates

    yourDate.isAfter(anotherLocalDate)
    yourDate.isBefore(anotherLocalDate)
    yourDate.isEqual(anotherLocalDate)

Extracting the month, year and day

    yourDate.getMonth()
    yourDate.getYear()
    yourDate.getDayOfMonth()

Calculating the period between two dates

To calculate the period between two LocalDates we can use the Period.between method, after having imported the java.time.Period class. The following example extracts the period and then the number of whole years in it:

    import java.time.Period

    val duration = Period.between(yourDate, anotherLocalDate)
    duration.getYears()

FAQ: Type Mismatch Errors

A common issue experienced in the academy is a type mismatch error when using .addRelatedDate. When using .addRelatedDate to add a dynamic date field to your score, it is recommended that you use a Date type. LocalDate can, however, still be used within your comparison logic.

FAQ: 💻 VDI Usage and Common Issues
FAQ relevant for: Academies that require a VDI

Your training VDI can be accessed via this link. Before it can be used, however, you will need to start the machine that it runs on, and you should also stop the machine after you have finished using it. The VDI can be started and stopped via this page on the Quantexa Community, and there is also a refresh button to check the status of the VDI. After you start the VDI, you will need to wait a few minutes for it to boot up before it will be accessible.

Common VDI issues & error messages:

“You have been disconnected.” - This message will appear when the VDI hasn’t been started yet. To solve it you just need to start the VDI and wait a few minutes.

“An internal error has occurred within the Guacamole server, and the connection has been terminated. If the problem persists, please notify your system administrator, or check your system logs.” - This message will sometimes appear if you have started your VDI but then tried to access it before it has fully booted up. If you wait a few more minutes then this should disappear and your VDI should be usable. If you still can't access your VDI after 10-15 minutes, it's likely due to your organization's firewall.

“The Guacamole server is denying access to this connection because you have exhausted the limit for simultaneous connection use by an individual user. Please close one or more connections and try again.” - This message will appear if you try to access the VDI from more than one browser tab at once, but it can also sometimes appear at random (possibly caused by network issues). To solve it, close all tabs then wait a second before trying to open a fresh VDI tab. Alternatively, you can hit the “Logout” button and then, on the next page, the “Re-login” button; this usually fixes it too. Poor network conditions may cause this problem to persist.

“The remote desktop server is currently unreachable. If the problem persists, please notify your system administrator, or check your system logs.” - This error message means there is something wrong with the boot disk of the VDI, which will need to be resolved by our cloud team. In this instance, please reach out to an ATL or create a new post in the Academy Q&A topic and we will get the team to resolve it for you.

If any of the above problems persist and you are unable to solve them, then please reach out to the training team via the Academy Q&A Topic.