South Korean company Upstage is advancing document processing by developing small language models, catering to specific industry needs and promising enhanced accuracy.
The South Korean company Upstage is working to improve document processing through the development of small language models (SLMs). Originally established to help large corporations manage documents using optical character recognition (OCR), Upstage has pivoted to meet the rising demand for advanced language models that followed the introduction of ChatGPT.
Speaking from Las Vegas, where Upstage was exhibiting at AWS re:Invent, co-founder and chief product officer Lucy Park discussed the company’s evolution in an interview. “Customers wanted a language model that was fit for their own use,” she said, explaining the impetus behind the creation of SLMs. Although Upstage’s OCR technology had reached a 95% accuracy rate, clients were seeking a solution that could deliver 100% accuracy, which prompted the company to explore models better suited to document processing tasks.
SLMs attract less attention than their larger counterparts, but they let corporations build language models tailored to their own requirements, whether industry-specific or locale-specific. By focusing on SLMs, Upstage can concentrate on specialised applications rather than the broad applicability of large language models (LLMs). This targeted approach could matter most in sectors that rely heavily on precise document processing.
The company builds on open-source models within its AI architecture, which allows its models to run on a single graphics processing unit (GPU). Its flagship model, Solar, competes with models such as Llama 3.1 8B and LG AI Research’s EXAONE 3.0 7.8B Instruct, which is distributed on Hugging Face. Park elaborated on model merging, a technique gaining traction in the AI sector. The method combines smaller LLMs into a larger one, for example merging a 7 billion parameter model with a 10 billion parameter model. “If we have a 14 billion model, we explode that into a 22 billion model,” she explained.
Research indicates that model merging allows universal models to be built without access to the original training data and without significant computational cost. A collaborative paper from researchers at Nanyang Technological University, Northeastern University, and Sun Yat-sen University reported these advantages. Upstage says the approach has produced notable benchmark gains, with its Solar Pro model achieving a 64% improvement in East Asian language mastery compared to earlier iterations.
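To make the idea of merging without training data concrete, here is a minimal sketch of weight averaging, one of the simplest merging schemes described in the research literature. It is not Upstage’s method, which the article does not detail. The function name `merge_weights`, the `alpha` parameter, and the toy checkpoints are all hypothetical, and real checkpoints would hold tensors rather than short lists.

```python
from typing import Dict, List

# A toy "checkpoint": parameter name -> flat list of weights (assumption for illustration).
Weights = Dict[str, List[float]]


def merge_weights(a: Weights, b: Weights, alpha: float = 0.5) -> Weights:
    """Linearly interpolate two checkpoints that share parameter names and sizes."""
    if a.keys() != b.keys():
        raise ValueError("checkpoints must have identical parameter names")
    merged: Weights = {}
    for name in a:
        if len(a[name]) != len(b[name]):
            raise ValueError(f"size mismatch for parameter {name!r}")
        # alpha weights model a, (1 - alpha) weights model b.
        merged[name] = [alpha * x + (1.0 - alpha) * y for x, y in zip(a[name], b[name])]
    return merged


# Two hypothetical fine-tuned variants of the same base architecture.
model_a = {"layer0.weight": [1.0, 2.0], "layer0.bias": [0.0]}
model_b = {"layer0.weight": [3.0, 4.0], "layer0.bias": [1.0]}

merged = merge_weights(model_a, model_b)
print(merged)  # {'layer0.weight': [2.0, 3.0], 'layer0.bias': [0.5]}
```

The merge needs only the two sets of weights, not the data either model was trained on. Averaging of this kind leaves the parameter count unchanged, so growing a model in the way Park describes would require a different technique that is not shown here.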
The company’s commitment to small language models is underscored by the specificity they can achieve. For instance, Upstage has developed a model tailored for the Thai language, utilising methodologies comparable to those of the GPT-4 model from OpenAI. The economics of SLMs also present a compelling case for businesses. Park offered a hypothetical comparison: where developing an SLM might cost as little as $10, a larger LLM might cost approximately $100.
Customers have three primary deployment options for these models. They can deploy on-premise, use Application Programming Interfaces (APIs) via the Upstage console, or obtain the models through the AWS Marketplace. Notably, the Solar Pro model has recently become available on the Amazon Bedrock Marketplace, further expanding its reach for organisations seeking advanced document processing technologies.
Source: Noah Wire Services