publications
publications by category in reverse chronological order. generated by jekyll-scholar.
2024
- [arXiv] RakutenAI-7B: Extending Large Language Models for Japanese. Rakuten Group, Inc., Aaron Levine, Connie Huang, and 27 more authors. 2024.
We introduce RakutenAI-7B, a suite of Japanese-oriented large language models that achieve the best performance on the Japanese LM Harness benchmarks among the open 7B models. Along with the foundation model, we release instruction- and chat-tuned models, RakutenAI-7B-instruct and RakutenAI-7B-chat respectively, under the Apache 2.0 license.
@misc{rakutengroup2024rakutenai7b,
  title = {RakutenAI-7B: Extending Large Language Models for Japanese},
  author = {{Rakuten Group, Inc.} and Levine, Aaron and Huang, Connie and Wang, Chenguang and Batista, Eduardo and Szymanska, Ewa and Ding, Hongyi and Chou, Hou Wei and Pessiot, Jean-François and Effendi, Johanes and Chiu, Justin and Ohlhus, Kai Torben and Chopra, Karan and Shinzato, Keiji and Murakami, Koji and Xiong, Lee and Chen, Lei and Kubota, Maki and Tkachenko, Maksim and Lee, Miroku and Takahashi, Naoki and Jwalapuram, Prathyusha and Tatsushima, Ryutaro and Jain, Saurabh and Yadav, Sunil Kumar and Cai, Ting and Chen, Wei-Te and Xia, Yandi and Nakayama, Yuki and Higashiyama, Yutaka},
  year = {2024},
  eprint = {2403.15484},
  publisher = {arXiv},
  url = {https://arxiv.org/abs/2403.15484},
  archiveprefix = {arXiv},
  primaryclass = {cs.CL},
}
2022
- [ICCL] Rakuten’s Participation in WAT 2022: Parallel Dataset Filtering by Leveraging Vocabulary Heterogeneity. Alberto Poncelas, Johanes Effendi, Ohnmar Htun, and 3 more authors. In Proceedings of the 9th Workshop on Asian Translation, Oct 2022.
This paper introduces our neural machine translation system’s participation in the WAT 2022 shared translation task (team ID: sakura). We participated in the Parallel Data Filtering Task. Our approach, based on Feature Decay Algorithms (FDA), achieved gains of +1.4 and +2.4 BLEU points for English-to-Japanese and Japanese-to-English respectively, compared to the model trained on the full dataset, showing the effectiveness of FDA for in-domain data selection.
@inproceedings{poncelas-etal-2022-rakutens,
  title = {Rakuten{'}s Participation in {WAT} 2022: Parallel Dataset Filtering by Leveraging Vocabulary Heterogeneity},
  author = {Poncelas, Alberto and Effendi, Johanes and Htun, Ohnmar and Yadav, Sunil and Wang, Dongzhe and Jain, Saurabh},
  booktitle = {Proceedings of the 9th Workshop on Asian Translation},
  month = oct,
  year = {2022},
  address = {Gyeongju, Republic of Korea},
  publisher = {International Conference on Computational Linguistics},
  url = {https://aclanthology.org/2022.wat-1.7/},
  pages = {68--72},
}
2021
- [ACL] Rakuten’s Participation in WAT 2021: Examining the Effectiveness of Pre-trained Models for Multilingual and Multimodal Machine Translation. Raymond Hendy Susanto, Dongzhe Wang, Sunil Yadav, and 2 more authors. In Proceedings of the 8th Workshop on Asian Translation (WAT2021), Aug 2021.
This paper introduces our neural machine translation systems’ participation in the WAT 2021 shared translation tasks (team ID: sakura). We participated in the (i) NICT-SAP, (ii) Japanese-English multimodal translation, (iii) Multilingual Indic, and (iv) Myanmar-English translation tasks. Multilingual approaches such as mBART (Liu et al., 2020) are capable of pre-training a complete, multilingual sequence-to-sequence model through denoising objectives, making them a great starting point for building multilingual translation systems. Our main focus in this work is to investigate the effectiveness of multilingual finetuning of such a multilingual language model on various translation tasks, including low-resource, multimodal, and mixed-domain translation. We further explore a multimodal approach based on universal visual representation (Zhang et al., 2019) and compare its performance against a unimodal approach based on mBART alone.
@inproceedings{susanto-etal-2021-rakutens,
  title = {Rakuten{'}s Participation in {WAT} 2021: Examining the Effectiveness of Pre-trained Models for Multilingual and Multimodal Machine Translation},
  author = {Susanto, Raymond Hendy and Wang, Dongzhe and Yadav, Sunil and Jain, Mausam and Htun, Ohnmar},
  editor = {Nakazawa, Toshiaki and Nakayama, Hideki and Goto, Isao and Mino, Hideya and Ding, Chenchen and Dabre, Raj and Kunchukuttan, Anoop and Higashiyama, Shohei and Manabe, Hiroshi and Pa, Win Pa and Parida, Shantipriya and Bojar, Ond{\v{r}}ej and Chu, Chenhui and Eriguchi, Akiko and Abe, Kaori and Oda, Yusuke and Sudoh, Katsuhito and Kurohashi, Sadao and Bhattacharyya, Pushpak},
  booktitle = {Proceedings of the 8th Workshop on Asian Translation (WAT2021)},
  month = aug,
  year = {2021},
  address = {Online},
  publisher = {Association for Computational Linguistics},
  url = {https://aclanthology.org/2021.wat-1.9/},
  doi = {10.18653/v1/2021.wat-1.9},
  pages = {96--105},
}
2012
- [BSP] Development of Low Cost Electromyography (EMG) Controlled Prosthetic Hand. Shailabh Suman, Sunil Kumar, and Pushparaj Mani Pathak. In Mechatronic & Innovative Applications, 2012.
Myoelectric signals are weak electrical signals (on the order of microvolts) generated within the muscles when they contract. These signals can be picked up on the surface of the skin above the muscles, or within the muscles using a needle. Based on this phenomenon, myoelectric control can be applied to prosthetic hands, using a surface electrode to receive myoelectric signals from a disabled person’s residual limb. A prosthetic hand is a customized limb that can be fitted on a disabled person’s residual limb to assist with manipulation tasks that can otherwise be performed only by hand. This chapter demonstrates the principle of controlling a prosthetic hand with surface EMG signals on a wooden prototype. The developed prosthetic hand also uses a force sensor and a proximity sensor to aid the EMG sensor for hybrid control. The chapter describes the steps in developing a low-cost, surface-EMG-controlled prosthetic hand.
@incollection{suman-etal-2012-iitroorkee,
  author = {Suman, Shailabh and Kumar, Sunil and Pathak, Pushparaj Mani},
  title = {Development of Low Cost Electromyography (EMG) Controlled Prosthetic Hand},
  booktitle = {Mechatronic \& Innovative Applications},
  pages = {37--54},
  year = {2012},
  publisher = {Bentham Science Publishers},
  doi = {10.2174/978160805441111201010037},
}