	---
	language: code
	datasets:
	- code_search_net
	- Fraser/python-lines
	tags:
	- python
	- code
	- masked-lm
	widget:
	- text: "assert 6 == sum([i for i in range(<mask>)])"
	---
	# roberta_python
	# Details
	This is a RoBERTa-base model trained on the Python portion of [CodeSearchNet](https://github.com/github/CodeSearchNet). It reached a dev perplexity of 3.296.
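
To put the perplexity figure in context: perplexity is conventionally the exponential of the mean per-token cross-entropy loss (natural log). The card does not state how the number was computed, so the sketch below simply assumes that convention, under which 3.296 corresponds to a loss of roughly 1.19. The function names are hypothetical and only serve the illustration.

```python
import math


def perplexity_from_loss(mean_cross_entropy: float) -> float:
    """Perplexity as exp of the mean per-token cross-entropy (in nats)."""
    return math.exp(mean_cross_entropy)


def loss_from_perplexity(perplexity: float) -> float:
    """Inverse mapping: the mean cross-entropy implied by a perplexity."""
    return math.log(perplexity)


# 3.296 is the dev perplexity quoted above; the implied loss is derived
# here under the stated assumption, not reported by the authors.
implied_loss = loss_from_perplexity(3.296)
print(round(implied_loss, 2))
```

The same relation is handy when monitoring further fine-tuning: an evaluation loss can be converted to perplexity and compared against the figure above, provided the evaluation data and masking setup are comparable.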

	This model was used for the enumerative solver baseline described in the [Programming Puzzles paper](https://arxiv.org/abs/2106.05784).

	See also the [Python Programming Puzzles (P3) Repository](https://github.com/microsoft/PythonProgrammingPuzzles) for more details.

	# Usage

	You can either load the model and fine-tune it further for a target task (as was done for the puzzle solver), or experiment with mask filling directly, as in the following example:

	```python
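	from transformers import AutoTokenizer, AutoModelForMaskedLM, pipeline

	# AutoModelForMaskedLM replaces the deprecated AutoModelWithLMHead.
	tokenizer = AutoTokenizer.from_pretrained("tals/roberta_python")
	model = AutoModelForMaskedLM.from_pretrained("tals/roberta_python")

	demo = pipeline("fill-mask", model=model, tokenizer=tokenizer)

	# The snippet must contain exactly one <mask> token for the model to fill.
	code = """sum= 0
	for i in range(<mask>):
	    sum += i
	assert sum == 6
	"""
	# Returns the candidate fills for <mask>, each with a score.
	demo(code)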
	```
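
To read the result of the call above, it helps to know that the `fill-mask` pipeline returns a list of dictionaries for a single-mask input, each with (among others) a `score` and a `token_str` key. The helper below is a minimal sketch for pulling out the top candidates; `top_fills` is a hypothetical name, and the sample input uses made-up scores purely to show the expected shape.

```python
from typing import Dict, List, Tuple


def top_fills(predictions: List[Dict], k: int = 3) -> List[Tuple[str, float]]:
    """Return the k highest-scoring (token, score) pairs from fill-mask output.

    Assumes the single-mask output shape: a flat list of dicts that each
    carry "score" and "token_str".
    """
    ranked = sorted(predictions, key=lambda p: p["score"], reverse=True)
    # RoBERTa-style tokens often carry a leading space, so strip it.
    return [(p["token_str"].strip(), p["score"]) for p in ranked[:k]]


# Placeholder input with made-up scores, NOT real model output;
# in practice pass the list returned by `demo(code)`.
example = [
    {"score": 0.25, "token_str": " 3"},
    {"score": 0.50, "token_str": " 4"},
]
print(top_fills(example, k=1))
```

With the real pipeline, `top_fills(demo(code))` would give the model's three most likely fills for `<mask>`. Inputs with several `<mask>` tokens return a nested list instead, which this sketch does not handle.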

	# BibTeX entry and citation info

	```bibtex
	@inproceedings{
	schuster2021programming,
	title={Programming Puzzles},
	author={Tal Schuster and Ashwin Kalyan and Alex Polozov and Adam Tauman Kalai},
	booktitle={Thirty-fifth Conference on Neural Information Processing Systems Datasets and Benchmarks Track (Round 1)},
	year={2021},
	url={https://openreview.net/forum?id=fe_hCc4RBrg}
	}
	```