torch.nn.Embedding and EmbeddingBag Explained

Embedding

torch.nn.Embedding is essentially a lookup table. It is typically used to store word embeddings and to retrieve them by index.

Location:

torch.nn.Embedding

Parameters, as described in the official documentation:

num_embeddings (int): size of the dictionary of embeddings
embedding_dim (int): the size of each embedding vector
padding_idx (int, optional): If given, pads the output with the embedding vector at padding_idx (initialized to zeros) whenever it encounters the index.
max_norm (float, optional): If given, each embedding vector with norm larger than max_norm is renormalized to have norm max_norm.
norm_type (float, optional): The p of the p-norm to compute for the max_norm option. Default 2.
scale_grad_by_freq (boolean, optional): If given, this will scale gradients by the inverse of frequency of the words in the mini-batch. Default False.
sparse (bool, optional): If True, gradient w.r.t. weight matrix will be a sparse tensor. See Notes for more details regarding sparse gradients.
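
The effect of max_norm can be checked with a small sketch. The table size, the fill value 3.0, and the chosen indices are arbitrary assumptions made for illustration. The sketch relies on the fact that the renormalization happens in place on the weight matrix, and only for rows that are actually looked up.

```python
import torch
import torch.nn as nn

embedding = nn.Embedding(10, 3, max_norm=1.0)
with torch.no_grad():
    # Give every row the same, deliberately large L2 norm: 3 * sqrt(3), about 5.2.
    embedding.weight.fill_(3.0)

indices = torch.tensor([1, 2, 4])
output = embedding(indices)

# Rows that were looked up have been rescaled so their norm is at most max_norm.
looked_up_norms = output.norm(p=2, dim=1)
# Row 0 was never looked up, so it keeps its original norm.
untouched_norm = embedding.weight[0].norm(p=2)

print(looked_up_norms)
print(untouched_norm)
```

Because the forward pass modifies the weight in place, code that needs the unclipped values should copy the weight before the lookup.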

Attributes:

weight (Tensor): the learnable weights of the module of shape (num_embeddings, embedding_dim), initialized from the standard normal distribution N(0, 1)

Shape:

Input: (*), LongTensor of arbitrary shape containing the indices to extract
Output: (*, H), where * is the input shape and H = embedding_dim

Examples:

>>> # an Embedding module containing 10 tensors of size 3
>>> embedding = nn.Embedding(10, 3)
>>> # a batch of 2 samples of 4 indices each
>>> input = torch.LongTensor([[1,2,4,5],[4,3,2,9]])
>>> embedding(input)
tensor([[[-0.0251, -1.6902,  0.7172],
         [-0.6431,  0.0748,  0.6969],
         [ 1.4970,  1.3448, -0.9685],
         [-0.3677, -2.7265, -0.1685]],

        [[ 1.4970,  1.3448, -0.9685],
         [ 0.4362, -0.4004,  0.9400],
         [-0.6431,  0.0748,  0.6969],
         [ 0.9124, -2.3616,  1.1151]]])

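As the output shows, identical indices produce identical embeddings. For example, index 2 in the first sample and index 2 in the second sample both map to [-0.6431, 0.0748, 0.6969].
In other words, for a given Embedding module, the same input index always returns the same vector.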
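
The lookup-table behaviour can be verified directly. The following is a minimal sketch reusing the sizes and indices from the example above; the variable names are illustrative only.

```python
import torch
import torch.nn as nn

embedding = nn.Embedding(10, 3)
indices = torch.tensor([[1, 2, 4, 5], [4, 3, 2, 9]])
output = embedding(indices)

# The output shape is the input shape with embedding_dim appended.
print(output.shape)  # torch.Size([2, 4, 3])

# The forward pass is plain row indexing into the weight matrix.
same_as_indexing = torch.equal(output, embedding.weight[indices])

# Index 2 appears at positions [0, 1] and [1, 2]; both hold the same row.
same_row = torch.equal(output[0, 1], output[1, 2])

print(same_as_indexing, same_row)  # True True
```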

With padding

>>> # example with padding_idx
>>> embedding = nn.Embedding(10, 3, padding_idx=0)
>>> input = torch.LongTensor([[0,2,0,5]])
>>> embedding(input)
tensor([[[ 0.0000,  0.0000,  0.0000],
         [ 0.1535, -2.0309,  0.9315],
         [ 0.0000,  0.0000,  0.0000],
         [-0.1655,  0.9897,  0.0635]]])

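With padding_idx = 0, the first and third positions (both index 0) return an all-zero vector, because the row at padding_idx is initialized to zeros and receives no gradient updates during training. Indices 2 and 5 are not the padding index, so they return ordinary embedding vectors.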
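
A short sketch makes the gradient behaviour of the padding row visible. It reuses the indices from the example above; summing the output is just an arbitrary scalar loss chosen for illustration.

```python
import torch
import torch.nn as nn

embedding = nn.Embedding(10, 3, padding_idx=0)
indices = torch.tensor([[0, 2, 0, 5]])

output = embedding(indices)
output.sum().backward()  # arbitrary scalar loss, used only to produce gradients

# The padding row starts as zeros ...
padding_row_is_zero = bool((embedding.weight[0] == 0).all())
# ... and gets a zero gradient even though index 0 was looked up twice.
padding_grad_is_zero = bool((embedding.weight.grad[0] == 0).all())
# Index 2 was looked up once, so its gradient under this loss is all ones.
used_row_grad = embedding.weight.grad[2]

print(padding_row_is_zero, padding_grad_is_zero)  # True True
print(used_row_grad)  # tensor([1., 1., 1.])
```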

One of its classmethods:

@classmethod
def from_pretrained(cls, embeddings, freeze=True, padding_idx=None,
                    max_norm=None, norm_type=2., scale_grad_by_freq=False, sparse=False):
    r"""Creates Embedding instance from given 2-dimensional FloatTensor.

    Args:
        embeddings (Tensor): FloatTensor containing weights for the Embedding.
            First dimension is being passed to Embedding as ``num_embeddings``, second as ``embedding_dim``.
        freeze (boolean, optional): If ``True``, the tensor does not get updated in the learning process.
            Equivalent to ``embedding.weight.requires_grad = False``. Default: ``True``
        padding_idx (int, optional): See module initialization documentation.
        max_norm (float, optional): See module initialization documentation.
        norm_type (float, optional): See module initialization documentation. Default ``2``.
        scale_grad_by_freq (boolean, optional): See module initialization documentation. Default ``False``.
        sparse (bool, optional): See module initialization documentation.

    Examples::

        >>> # FloatTensor containing pretrained weights
        >>> weight = torch.FloatTensor([[1, 2.3, 3], [4, 5.1, 6.3]])
        >>> embedding = nn.Embedding.from_pretrained(weight)
        >>> # Get embeddings for index 1
        >>> input = torch.LongTensor([1])
        >>> embedding(input)
        tensor([[ 4.0000,  5.1000,  6.3000]])
    """

EmbeddingBag

Computes sums or means of ‘bags’ of embeddings, without instantiating the intermediate embeddings.

Three modes are supported. For a 2D input of shape (B, N), that is, B bags of N indices each:

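sum: equivalent to torch.nn.Embedding followed by torch.sum(dim=1)
mean: equivalent to torch.nn.Embedding followed by torch.mean(dim=1)
max: equivalent to torch.nn.Embedding followed by torch.max(dim=1) (taking the values)
However, EmbeddingBag is more time- and memory-efficient than chaining these operations, because it never materializes the intermediate embeddings.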
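
The equivalence can be checked with a small sketch. It assumes a 2D input of shape (B, N), in which each row is one bag, so the reduction runs over the bag dimension dim=1. The table size and indices are arbitrary, and the weights are copied between modules only so that both use the same table.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
bags = torch.tensor([[1, 2, 4, 5], [4, 3, 2, 9]])  # 2 bags of 4 indices each

embedding = nn.Embedding(10, 3)
reference = embedding(bags)  # shape (2, 4, 3)

expected = {
    "sum": reference.sum(dim=1),
    "mean": reference.mean(dim=1),
    "max": reference.max(dim=1).values,
}

matches = {}
for mode in ("sum", "mean", "max"):
    bag = nn.EmbeddingBag(10, 3, mode=mode)
    with torch.no_grad():
        bag.weight.copy_(embedding.weight)  # use the same lookup table
    pooled = bag(bags)  # shape (2, 3): one pooled vector per bag
    matches[mode] = torch.allclose(pooled, expected[mode], atol=1e-5)

print(matches)
```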
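PyTorch also supports passing per-sample weights (per_sample_weights) in the forward pass, but only when mode == 'sum'. If per_sample_weights is None, every weight is treated as 1; otherwise the weighted sum is computed with the given weights.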
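
A weighted sum with per_sample_weights can be sketched as follows. The weight values are arbitrary assumptions; per_sample_weights must have the same shape as the index tensor. The manual computation is included only as a cross-check.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
bag = nn.EmbeddingBag(10, 3, mode="sum")
bags = torch.tensor([[1, 2, 4, 5], [4, 3, 2, 9]])
# One weight per index; the second bag uses all ones.
weights = torch.tensor([[1.0, 0.5, 0.0, 2.0], [1.0, 1.0, 1.0, 1.0]])

weighted = bag(bags, per_sample_weights=weights)
unweighted = bag(bags)  # per_sample_weights=None, i.e. every weight is 1

# Cross-check: scale each looked-up row by its weight, then sum over the bag.
manual = (bag.weight[bags] * weights.unsqueeze(-1)).sum(dim=1)

print(torch.allclose(weighted, manual, atol=1e-5))
print(torch.allclose(weighted[1], unweighted[1], atol=1e-5))
```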
The remaining parameters are largely the same as those of Embedding.

Original link: https://blog.csdn.net/weixin_46559271/article/details/106356155
