Word Meaning and Word2Vec Trailhead: Construct_Examples method

I feel like I must be missing something with the code provided for this method. Near the beginning of the method the instructions tell you to create a window of size 'n'.

Later on, you pop the center word out of the window so I believe the length of the window would now be n-1.

At the end of the CBOW section there is the following if statement:
if len(window) < n:
    continue

It seems to me that len(window) is always n-1 at this point. As a result, you continue and never add anything to the examples list, so the number of CBOW samples ends up zero.
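
A minimal sketch of the situation described above, assuming the window is an ordinary Python list and the center word is removed in place with pop. The variable names and the toy word indices are illustrative and not taken from the trail's code:

```python
n = 5
sentence = [10, 11, 12, 13, 14, 15, 16]  # made-up word indices

window = sentence[0:n]             # full window of n words
center_word = window.pop(n // 2)   # removes the center word in place
shrunk_len = len(window)           # the window has lost one element
print(shrunk_len < n)              # the "len(window) < n" check now always fires

# Recording the length before removing the center, or building the
# context as a separate list, keeps the length check meaningful.
window = sentence[0:n]
is_full = len(window) == n
context_words = window[:n // 2] + window[n // 2 + 1:]
print(is_full, center_word, context_words)
```

If the check is meant to reject windows that were cut short at the end of a sentence, it has to look at the window length before the center word is taken out.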

What am I missing???

Tom

March 15, 2019

Deepali Kulshrestha
Hi Tom,

Greetings to you!

Use this code:

import random

def construct_examples(numericalized_sentences, vocabulary, num_examples=int(1e6), n=5, sg=True, k=0):
    examples = []
    while True:
        # TODO: select a random sentence index using random.randint and get that
        # sentence. Be careful to avoid indexing errors.
        sentence_idx = random.randint(0, len(numericalized_sentences) - 1)
        sentence = numericalized_sentences[sentence_idx]
        if not sentence:  # skip empty sentences so randint below cannot fail
            continue
        # TODO: Select a random window index using random.randint
        # and obtain that window of size n. Be careful to avoid indexing errors.
        window_idx = random.randint(0, len(sentence) - 1)
        # The slice end is an absolute index, so it must be window_idx + n
        window = sentence[window_idx:window_idx + n]

        if len(window) <= n // 2:
            continue

        # TODO: Get the center word and the context words
        center_idx = int(round(len(window) / 2))
        center_word = window[center_idx]
        # Build a new list so that window itself keeps its original length
        context_words = window[:center_idx] + window[center_idx + 1:]

        # TODO: Create examples using the guidelines above
        if sg:  # if Skip-Gram
            context_word = context_words[random.randint(0, len(context_words) - 1)]
            example = [center_word, context_word]
        else:  # if CBOW
            example = [context_words, center_word]
            if len(window) < n:  # only keep full windows for CBOW
                continue

        if k > 0:  # if doing negative sampling
            samples = [random.randint(0, len(vocabulary.index_to_word) - 1)
                       for _ in range(k)]
            example.append(samples)

        examples.append(example)
        if len(examples) >= num_examples:
            break

    return examples
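
As a side note on selecting the window, here is a small sketch of how Python list slicing behaves when taking n words from a random start index. The helper get_window and the toy word indices are hypothetical, used only for illustration. The end of a slice is an absolute index, so a window starting at start has to end at start + n, and a slice that runs past the end of the sentence is silently cut short, which is why the length checks matter:

```python
def get_window(sentence, start, n):
    """Return up to n items of sentence beginning at index start (hypothetical helper)."""
    return sentence[start:start + n]

sentence = [10, 11, 12, 13, 14, 15, 16]  # made-up word indices
n = 5

full = get_window(sentence, 1, n)        # a complete window of n words
truncated = get_window(sentence, 5, n)   # cut short at the end of the sentence
wrong_end = sentence[5:n]                # empty: the end index is not past the start
print(full, truncated, wrong_end)
```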

I hope you find the above solution helpful. If it does, please mark as Best Answer to help others too.

Thanks and Regards,
Deepali Kulshrestha

March 29, 2019

Tom Flynn 19
Thanks Deepali. I probably will not get back to this until late this week but I appreciate the help.

Sincerely,

Tom

April 2, 2019

Atif Razzaq
Hello Deepali

I am stuck on this trail as well. For Question 9 I am getting 39, which isn't one of the listed answers. The following is my code snippet:

sentence_idx = random.randint(0,len(numericalized_sentences)-1)
sentence = numericalized_sentences[sentence_idx]
# TODO: Select a random window index using random.randint
# and obtain that window of size n. Be careful to avoid indexing errors.
window_idx = random.randint(0,len(sentence)-1)
window = sentence[window_idx:window_idx+n]
if len(window) <= (n//2):
    continue
# TODO: Get the center word and the context words
center_word = sentence[window_idx+(n//2)]
context_words = sentence[window_idx:window_idx+(n//2)] + sentence[window_idx+(n//2)+1:window_idx+n]

I am struggling with "Word2VecModel(nn.Module)" class definition and "def train(...)". Can someone please kindly share the code? Hints and instructions given in the code are not clear and aren't helpful.

Thanks in anticipation!

December 2, 2019
