spark

5 Topics

New guide 📖 Scoring Concepts: Write-Once Steps
Discover how to efficiently use Write-Once Steps in the Assess framework for data transformation. This detailed guide complements the Write-Once Steps documentation and helps you determine when to apply Write-Once Steps effectively in both Batch and Dynamic Scoring contexts.

Key topics covered:
The potential cost of using the wrong method
When to use Write-Once Steps vs. Logical Sources
Strategies for scoring networks in Batch (SparkScoringContext) and Dynamic (DynamicScoringContext) environments

Gain a deeper understanding of how to avoid duplicating logic across contexts and streamline your data engineering workflows. Read the full article (login required) to explore practical scenarios and best practices for scoring networks with Write-Once Steps:

5. Scoring Concepts: Write-Once Steps - Quantexa Community
This article extends the product documentation for Write-Once Steps and provides a guide to the situations in which Write-Once Steps should be used. Introduction: The Write-Once Step is a data transformation step in the Assess framework that is executable in both Batch and Dynamic contexts, meaning you do not…
Zhenyu_Li
10 months ago Place Getting Started
Why Does Entity Quality Matter? & Best of the Community from March
March Top Picks
Why Entity Quality Matters 🔐 login required
Enter our latest competition: The Knowledge Exchange 📝
New Education Programs: Scala & Spark Bootcamp and Quantexa Data Engineer Velocity Program
Quantexa & Xander Talent - New Education Partnership 🤝
Elevating Data Management: Unveiling the Pillars of a Trusted Data Foundation with Quantexa

Latest from the Community Library 📖
A day in the life of a... Senior Learning Designer
A day in the life of... an Academy Trainee
How To Test Your Upgrades 🔐 login required
Updates to Quantexa Supported Versions 🔐 login required

Upcoming events 🗓️
3rd May Community Connect 👋 Join for a demo of top Community features.

Best of Q&A ✅
Unable to load Batch Scores to Elastic 🔐 login required
How to fix image style for investigation icon in the Qx UI? 🔐 login required
Error creating bean with name 'springSecurityFilterChain' 🔐 login required

New & Popular Ideas 💡
Usability increase using 2 screens for investigations 🔐 login required
Changing the default settings for Graphic Filters in Data Viewer 🔐 login required
Make updates to metadata.parquet optional 🔐 login required

In case you missed it 📣
Welcome to Quantexa 2.6 | 2.6.0 Release Announcement
🏆 Badge of the Month: The Name Dropper Badge

Community quick links
🚀 Submit and vote for Ideas in our Ideas Portal
🗣️ Join one of our Specialist User Groups: FinCrime, Insurance, Data Management & KYC
📚️ Browse blogs, articles and guides in our Community Library
Stephanie_Richardson
2 years ago Place News & Announcements
Quantexa & Xander Talent - New Education Partnership
Driving the creation of COEs for customers and partners to enable them to leverage Decision Intelligence (DI) technology via accelerated learning programs and talent provision.

Quantexa, a leader in Decision Intelligence (DI), and Xander Talent, a top digital talent consultancy, are excited to announce a strategic educational partnership. This collaboration marks a key milestone in driving industry innovation and talent development.

Achieving unicorn status in early 2023, Quantexa is now enhancing its educational offerings to empower customers and partners, fostering self-sufficiency in the ever-evolving decision intelligence landscape. This partnership represents our commitment to providing cutting-edge educational resources and nurturing exceptional talent in the technology sector. Additionally, this new educational capability enhances our Quantexa Partner Program by adding a new partnership category, Quantexa Education Partner, for expanded knowledge share capabilities to partners.

The Quantexa Academy is enhancing its educational programs through a partnership with Xander Talent. This collaboration is a key part of our continuous effort to strengthen partner support and increase customer self-sufficiency. It represents an important extension of the reach and effectiveness of Quantexa’s educational initiatives. We are excited to work with an innovative company like Xander Talent, confident that together, we will drive innovation for our customers and partners.

Oliver Butler, Head of Education Services, Quantexa
We are excited to partner with Xander to help us scale our strategic education initiatives. We see this partnership as a key contributor to driving customer and partner self-sufficiency and quickly mobilising decision intelligence centers of excellence.

Christel Wolthoorn, CEO, Xander Talent
Xander is thrilled to become a Quantexa Education Partner. This is a true partnership in every sense of the word, as Xander certified Academy Team Leads will be fully integrated into the Quantexa Academy program to work along Quantexa’s own Academy Team Leads and by doing so increase the value of the contribution that we make to the success of Quantexa’s customers and partners.

Sheryl Wharff, VP Global Alliance Marketing, Quantexa
We are thrilled to add Xander as our first Education Partner, expanding our Quantexa Partner Program announced in 2022. Providing new delivery methods, which enable our Quantexa Partners to learn to deliver our Decision Intelligence technology in a faster-time-to-value model, is the way we will further grow and expand our “Partner First” company objectives.

🤝 Partnership Scope

💻💡 Scala & Spark Bootcamp
Duration: 5 Days
An instructor-led program designed to enable candidates with a background in programming to successfully complete the Scala and Spark Assessment, the pre-requisite entry point to the Data and Scoring Engineer Academies, within a 5-Day period. For further details about the Scala Prerequisite Assessment please see our Scala Prerequisite Assessment for Quantexa Certification Academies Article.

💻🏆 Data Engineer Velocity Program
Duration: 30 Days
This program is designed to enable candidates to successfully gain the Data Engineer Certification within a focused 30-Day period via a set of Velocity Workshops at key stages of the education program.

💡 Xander’s Talent Engine
An additional partner enablement and customer self-sufficiency component of Xander’s business model is to provide a service that allows Xander’s top performing talents to be embedded within partner and customer teams – initially as consultants, and then, potentially, as full-time employees. In this way organisations can take advantage of Xander pre-certified Quantexa Data Engineers, Scoring Engineers, and Technical Business Analysts, to help them build out their Quantexa project teams.

Resources
Visit Xander Talent’s corporate website to explore their recruiting and training services that build skilled teams of digital talents tailored to your needs.
Visit the article on the Quantexa Community website to learn more about the extensive range of training products offered by the Quantexa Academy.
Visit Quantexa’s corporate website to learn how its class-leading Decision Intelligence Platform brings innovation and confidence in decision-making across various industries by utilizing contextual data.
Oliver_Butler
2 years ago Place News & Announcements
Spark Settings for Success
This page introduces spark-submit job settings and the considerations to weigh when setting them. See the Spark documentation for all the ways to configure a Spark job: https://spark.apache.org/docs/latest/configuration.html

Spark Settings
Spark settings depend heavily on data source type and size, and should be tuned separately for each source. Initial ElasticLoad Spark settings for a 40-datanode cluster could look like this:
spark.driver.memory=20g
spark.executor.instances=20
spark.executor.memory=20g
spark.executor.cores=2

Description of Key Spark Settings
spark.driver.memory sets the amount of memory allocated to the Spark driver, the central control program of a Spark application. The driver coordinates tasks, manages the overall execution of the application, and collects results.
spark.executor.instances sets the initial number of executor instances to allocate when the application starts. Each executor instance is a separate process that can run tasks in parallel.
spark.executor.memory sets how much memory each Spark executor has available for storing data, caching, and performing computations.
spark.executor.cores sets the number of CPU cores allocated to each executor. It determines how many tasks an executor can run in parallel and therefore how the application uses the available CPU resources.
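As a rough sketch of how the initial settings above might be passed to a job, the snippet below assembles a spark-submit command line from a dictionary of settings. The helper name `build_spark_submit_args` and the jar path `elastic-load.jar` are hypothetical placeholders for illustration, not part of any Quantexa tooling; only the `--conf key=value` form comes from spark-submit itself.

```python
import shlex

# Initial ElasticLoad settings from the post (40-datanode cluster).
INITIAL_SETTINGS = {
    "spark.driver.memory": "20g",
    "spark.executor.instances": "20",
    "spark.executor.memory": "20g",
    "spark.executor.cores": "2",
}


def build_spark_submit_args(settings, app_jar):
    """Return a spark-submit argument list with one --conf flag per setting.

    Hypothetical helper for illustration; app_jar is a placeholder path.
    """
    args = ["spark-submit"]
    for key, value in settings.items():
        args += ["--conf", f"{key}={value}"]
    args.append(app_jar)
    return args


if __name__ == "__main__":
    # Print a shell-safe command line; the jar name is an assumption.
    print(shlex.join(build_spark_submit_args(INITIAL_SETTINGS, "elastic-load.jar")))
```

Keeping the settings in one dictionary per data source makes it easier to tune each source separately, as the post recommends.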
Tips for Spark Settings

It is advisable to maintain…
Ratio 1:1 Number of Datanodes : NumberOfExecutors*ExecutorCores
Some performance gains can be seen when the ratio changes to:
Ratio 1:3 Number of Datanodes : NumberOfExecutors*ExecutorCores
Example: 40 datanodes with spark.executor.instances=30 and spark.executor.cores=4 gives 40 : 120 (1:3).
Increasing the number of data nodes brings down total loading time and reduces bulk rejection errors.

Update Default Parallelism…
Using spark.default.parallelism is a powerful way of tuning your Spark job. This configuration defines the default number of partitions used for distributed data processing operations when the number of partitions is not explicitly specified, so it largely determines the degree of parallelism of your Spark application.
It should ideally be a multiple of the number of CPU cores in your cluster.
If you enable dynamic allocation (below), Spark can adjust the number of executors dynamically based on workload. In such cases, you may set a conservative initial value for parallelism.
Operations that shuffle data between partitions (e.g., join, groupByKey) often benefit from more partitions, as this reduces the amount of data each task has to process and can improve performance.
Consider the distribution of your data. Uneven data distribution can lead to load imbalance among partitions, affecting overall job performance. Adjusting the number of partitions can help address such issues.

Enable Dynamic Allocation…
spark.dynamicAllocation.enabled is a boolean configuration option that determines whether dynamic allocation of executors is enabled for a Spark application. Dynamic allocation allows Spark to adjust the number of executor instances, and with them the total CPU cores and memory in use, based on the workload and resource demands of the application.
When set to true, dynamic allocation is enabled, allowing Spark to add or remove executor instances as needed during the application's execution. When set to false, dynamic allocation is disabled, and the number of executor instances remains fixed throughout the application's lifetime, as determined by the initial configuration.
Note: Dynamic allocation will scale up to take all available resources, so use it with care if your project is very cost-conscious or if you share limited resources with other teams.
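The tips in this post can be checked with a little arithmetic. The sketch below computes the datanode-to-core ratio, derives a spark.default.parallelism value that is a multiple of the total executor cores, and builds a dynamic allocation configuration with an upper bound on executors. The function names are hypothetical, and `tasks_per_core=2` is an assumed starting point rather than a value from the post; `spark.dynamicAllocation.minExecutors` and `spark.dynamicAllocation.maxExecutors` are standard Spark settings.

```python
def cores_per_datanode(datanodes, executor_instances, executor_cores):
    """Return total executor cores per datanode (1.0 means a 1:1 ratio)."""
    if datanodes <= 0:
        raise ValueError("datanodes must be positive")
    return executor_instances * executor_cores / datanodes


def suggested_parallelism(executor_instances, executor_cores, tasks_per_core=2):
    """Return a spark.default.parallelism value that is a multiple of total cores.

    tasks_per_core=2 is an assumption for illustration, not a value from the post.
    """
    return executor_instances * executor_cores * tasks_per_core


def dynamic_allocation_conf(min_executors, max_executors):
    """Return settings that enable dynamic allocation within a fixed executor range."""
    if min_executors > max_executors:
        raise ValueError("min_executors must not exceed max_executors")
    return {
        "spark.dynamicAllocation.enabled": "true",
        "spark.dynamicAllocation.minExecutors": str(min_executors),
        "spark.dynamicAllocation.maxExecutors": str(max_executors),
    }


if __name__ == "__main__":
    print(cores_per_datanode(40, 20, 2))   # 1.0 -> ratio 1:1 (initial settings)
    print(cores_per_datanode(40, 30, 4))   # 3.0 -> ratio 1:3 (post's example)
    print(suggested_parallelism(30, 4))    # 240
    print(dynamic_allocation_conf(5, 30))
```

Setting a maximum executor count is one way to keep dynamic allocation from taking every available resource on a shared or cost-sensitive cluster. Depending on the cluster manager and Spark version, dynamic allocation may also need an external shuffle service or shuffle tracking enabled, so check the Spark configuration documentation before relying on it.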
Clare_Jones
2 years ago Place Getting Started
Read now: What it's Like to Perform an Upgrade
Read all about our experience and key takeaways when upgrading a repository from version 2.1 to 2.3. The upgrade was performed and released in early 2023, shortly after version 2.3 became available. The main motivation behind this particular upgrade was to upgrade the batch tier software, including the versions of EMR and Spark that Quantexa 2.3 supports. The flexibility of the new Data Viewer and other newly released features were also important value-adds.

What's it Like to Perform an Upgrade? - Quantexa Community
This blog describes the experience and key takeaways when upgrading a repository from version 2.1 to 2.3. Why upgrade Quantexa versions? As with any software, the advantages of upgrading to newer versions include: Access to new…
Max_Mills
2 years ago Place Getting Started