Keeping game animation smooth -- using a fixed logic frame rate

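lanche · posted on 2004-7-31 14:59:06

Perhaps this material has long been available in China, but I wouldn't know, because I have only just started learning game programming on Linux (and I have no game programming experience on other platforms such as Windows). The difficulties turned out to be far greater than I had imagined, and only afterwards did I realize how important this kind of basic theory is for developing games and similar software. I can't finish translating it all in one go, so I'm posting the material now; anyone who is able is welcome to help translate, and I'll keep updating this post.

The material follows: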
[code:1]
A Fixed Logic Frame Rate Demo
-----------------------------
   This SDL programming example - a simple
   platform game - demonstrates the use of
   a fixed virtual logic frame rate together
   with interpolation, for smooth and accurate
   game logic that is independent of the
   rendering frame rate.

   The example also demonstrates sprite
   animation and partial display updating
   techniques, suitable for games and
   applications that need high frame rates
   but can do without updating the whole
   screen every frame.

Fixed Logic Frame Rate
----------------------
Having a fixed logic frame rate means that the game
logic (that is, what defines the gameplay in terms
of object behavior and user input handling) runs a
fixed number of times per unit of time. This makes
it possible to use "frame count" as a unit of time.

More interestingly, since the logic frame rate
can be set at any sufficient value (say, 20 Hz for
a slow turn based game, or 100 Hz for fast action)
the logic code will run exactly once per frame.
Thus, there is no need to take delta times into
account, solve equations, or make calculations on
velocity, acceleration, jerk and the like.

You can just deal with hardcoded "step" values and
simple tests.

Perhaps most importantly, you can *still* rely
on the game behaving *exactly* the same way,
regardless of the rendering frame rate or other
system dependent parameters - something that is
virtually impossible with delta times, since you
cannot have infinite accuracy in the calculations.

Virtual Logic Frame Rate
------------------------

By "virtual", I mean that the actual frame rate is
not necessarily stable at the nominal value at all
times. Rather, the *average* logic frame rate is
kept at the nominal value by means of controlling
the number of logic frames processed for each
rendered frame.

That is, if the rendering frame rate is lower
than the nominal logic frame rate, the engine will
run the game logic several times before rendering
each frame. Thus, the game logic may actually be
running at tens of kHz for a few frames at a time,
but this doesn't matter, as long as the game logic
code relies entirely on logic time.

So, do not try to read time using SDL_GetTicks()
or similar in the game logic code! Instead, just
count logic frames, like we did back in the C64 and
Amiga days, where video frames were actually a
reliable time unit. It really works!

Resampling Distortion
---------------------
Now, there is one problem with fixed logic frame
rates: resampling distortion. (The same phenomenon
that causes poor audio engines to squeal and feep
when playing back waveforms at certain pitches.)

The object coordinates generated by the game
logic engine can be thought of as streams of values
describing signals (in electrical engineering/DSP
terms) with a fixed sample rate. Each coordinate
value is one stream.

Since the logic frame rate is fixed, and the
game logic runs an integer number of times per
rendered frame, what we get is a "nearest point"
resampling from the logic frame rate to the
rendering frame rate. That's not very nice, since
only the last set of coordinates after each run of
logic frames is actually used - the rest are thrown
away!

What's maybe even worse, especially if the logic
frame rate is low, is that you get new coordinates
only every now and then, when the rendering frame
rate is higher than the logic frame rate.

Getting Smooth Animation
------------------------
So, what do we do? Well, given my hint above, the
answer is probably obvious: interpolation! We just
need to replace the basic "nearest sample" method
with something better.

Resampling is a science and an art in the audio
field, and countless papers have been written on
the subject, most of which are probably totally
incomprehensible for anyone who hasn't got a degree
in maths.

However, our requirements for the resampling can
be kept reasonably low by keeping the logic frame
rate relatively high (i.e. in the same order of
magnitude as the expected rendering frame rate) -
and we generally want to do that anyway, to reduce
the game's control latency.

Choosing an Interpolator
------------------------
Since the rendering frame rate can vary constantly
in unpredictable ways, we will have to recalculate
the input/output ratio of the resampling filter for
every rendered frame.

However, using a polynomial interpolator (as
opposed to an FIR resampling filter), we can get
away without actually doing anything special. We
just feed the interpolator the coordinates and the
desired fractional frame time, and get the
coordinates calculated.

DSP people will complain that a polynomial
resampler (that is, without a brickwall filter, or
oversampling + bandlimited downsampling) doesn't
really solve the whole problem. Right, it doesn't
remove frequencies above Nyquist of the rendering
frame rate, so those can cause aliasing distortion.
But let's consider this:
Do we actually *have* significant amounts of
energy at such frequencies in the data from the
game logic? Most probably not! You would have to
have objects bounce around or oscillate at insane
speed to get anywhere near Nyquist of (that is, 50%
of) any reasonable (i.e. playable) rendering frame
rate. In fact, we can probably assume that we're
dealing with signals in the range 0..10 Hz. Not
even the transients caused by abrupt changes in
speed and direction will cause visible side
effects.

So, in this programming example, I'm just using
a simple linear interpolator. No filters, no
oversampling or anything like that. As simple as it
gets, but still an incredible improvement over
"nearest sample" resampling. You can enable/disable
interpolation with the F1 key when running the
example.

Rendering Sprites
-----------------
In order to cover another animation related FAQ,
this example includes "smart" partial updates of
the screen. Only areas that are affected by moving
and/or animated sprites are updated.

To keep things simple and not annoyingly
non-deterministic, updates are done by removing all
sprites, updating their positions and animation
frames, and then rendering all sprites. This is
done every frame, and includes all sprites, whether
they move or not.

So, why not update only the sprites that
actually moved? That would allow for cheap but
powerful animated "backgrounds" and the like.

Well, the problem is that sprites can overlap,
and when they do, they start dragging each other
into the update loop, leading to recursion and
potentially circular dependencies. A non-recursive
two-pass (mark + render) algorithm is probably a
better idea than actual recursion. It's quite
doable and neat, if the updates are restricted by
clipping - but I'll leave that for another example.
Pretty much all sprites in Fixed Rate Pig move all
the time, so there's nothing to gain by using a
smarter algorithm.

Efficient Software Rendering
----------------------------
To make it a bit more interesting, I also added
alpha blending for sprite anti-aliasing and effects.
Most 2D graphics APIs and drivers (and as a result,
most SDL backends) lack h/w acceleration of alpha
blended blits, which means the CPU has to perform
the blending. That's relatively expensive, but
SDL's software blitters are pretty fast, and it
turns out *that's* usually not a problem.

However, there is one problem: Alpha blending
requires that data is read from the target surface,
modified, and then written back. Unfortunately,
modern video cards handle CPU reads from VRAM very
poorly. The bandwidth for CPU reads - even on the
latest monster AGP 8x card - is on par with that of
an old hard drive. (I'm not kidding!)

This is why I wanted to demonstrate how to avoid
this problem, by rendering into a s/w back buffer
instead of the h/w display surface. If you're on a
system that supports hardware display surfaces, you
can see the difference by hitting F2 in the game,
to enable/disable rendering directly into VRAM.

Indeed, SDL can set that up for you, but *only*
if you ask for a single buffered display - and we
do NOT want that! Single buffered displays cannot
sync animation with the retrace, and as a result,
we end up hogging the CPU (since we never block,
but just pump out new frames) and still getting
unsmooth animation.

Incidentally, this approach of using a s/w back
buffer for rendering mixes very well with partial
update strategies, so it fits right in.

Smart Dirty Rectangle Management
--------------------------------
The most complicated part of this implementation
is keeping track of the exact areas of the screen
that need updating. Just maintaining one rectangle
per sprite would not be sufficient. A moving sprite
has to be removed, animated and then re-rendered.
That's two rectangles that need to be pushed to the
screen; one to remove the old sprite image, and one
for the new position.

On a double buffered display, it gets even worse,
as the rendering is done into two alternating
buffers. When we update a buffer, the old sprites
in it are actually *two* frames old - not one.

I've chosen to implement a "smart" rectangle
merging algorithm that can deal with all of this
with a minimum of support from higher levels. The
algorithm merges rectangles in order to minimize
overdraw and rectangle count when blitting to and
updating the screen. See the file dirtyrects.txt for
details. You can (sort of) see what's going on by
hitting F3 in the game. Here's what's going on:

1. All sprites are removed from the rendering
   buffer. The required information is found
   in the variables that store the results of
   the interpolation.

2. The dirtyrect table for the display surface
   is swapped into a work dirtyrect table. The
   display surface dirtyrect table is cleared.

3. New graphic coordinates are calculated, and
   all sprites are rendered into the rendering
   buffer. The bounding rectangles are fed
   into the display surface dirtyrect table.

4. The dirtyrect table compiled in step 3 is
   merged into the work dirtyrect table. The
   result covers all areas that need to be
   updated to remove old sprites and make the
   new ones visible.

5. The dirtyrect table compiled in step 4 is
   used to blit from the rendering buffer to
   the display surface.

On a double buffered display, there is one
dirtyrect table for each display page, and there
is (obviously) a page flip operation after step 5,
but other than that, the algorithm is the same.
[/code:1]
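
As a rough illustration of the loop the README describes (run however many whole logic frames are due, then draw at an interpolated position), here is a minimal sketch in C. The names `Object`, `Clock`, `LOGIC_FPS`, `logic_frame`, `advance` and `interpolate` are made up for this sketch and are not taken from the demo's source. Elapsed real time is passed in as `dt_ms` so the code runs without SDL; in an SDL program the caller would presumably derive it from SDL_GetTicks() in the main loop, outside the game logic. The sketch interpolates between the last two logic positions, which is one common way to do it and is an assumption about the details.

```c
#define LOGIC_FPS 20.0  /* nominal logic frame rate in Hz (assumed value) */

typedef struct {
    double prev_x;  /* position after the previous logic frame */
    double x;       /* position after the latest logic frame */
    double vx;      /* hardcoded step per logic frame */
} Object;

typedef struct {
    double logic_time;  /* elapsed time in logic frames, including fraction */
    long   frames_run;  /* whole logic frames executed so far */
} Clock;

/* One logic frame: no delta time, just a fixed step. */
static void logic_frame(Object *o)
{
    o->prev_x = o->x;
    o->x += o->vx;
}

/*
 * Advance the clock by dt_ms of real time (dt_ms >= 0), run every whole
 * logic frame that has become due, and return the fractional frame time
 * (0 <= result < 1) to be used for interpolation when rendering.
 */
static double advance(Clock *c, Object *o, double dt_ms)
{
    long target;

    c->logic_time += dt_ms * LOGIC_FPS / 1000.0;
    target = (long)c->logic_time;  /* truncation is fine: never negative */
    while (c->frames_run < target) {
        logic_frame(o);
        c->frames_run++;
    }
    return c->logic_time - (double)target;
}

/* Linear interpolation between the last two logic positions. */
static double interpolate(const Object *o, double frac)
{
    return o->prev_x + (o->x - o->prev_x) * frac;
}
```

With a 20 Hz logic rate and a step of 4 per logic frame, 75 ms of real time runs one logic frame and leaves a fraction of 0.5, so the object is drawn at 2, halfway between its last two logic positions. Calling `advance` with a long gap simply runs more logic frames in a row before the next render, which is the "virtual" frame rate behavior described in the README.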

sagaeon · posted on 2004-7-31 16:16:36

Nice.

lanche · posted on 2004-7-31 16:55:35

It's 90% translated already, but I have to rush off to my shift now, so I'll post the rest tomorrow.

lovewilliam · posted on 2004-7-31 17:11:46

Yeah, nice.

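sagaeon · posted on 2004-7-31 18:40:19

A few opinions of my own, to discuss with lanche.

lanche translated "resample" as 采样 (sampling). "Resample" is a technical term in signal and image processing (重采样, resampling), and there are many functions and tools for it.
"Interpolate" is usually rendered as 插值 (interpolation) in China.
As far as I can see, the original doesn't say it is a cross-platform game; it just says "platform". It should be the kind of game where you jump up and down between lots of platforms, the kind that was very common among the old cartridge games played on a TV.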

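sjinny · posted on 2004-8-2 11:00:49

1. Actually... I don't think full-screen refresh is as inefficient as one might imagine... Think about it: in a scene, most areas may need refreshing anyway. Background map, buildings, characters... many of them are animated, and in that case partial refresh doesn't gain much...
2. What puzzles me is... sometimes a sprite (精灵) moves one pixel at a time, and it still looks jerky...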

sagaeon · posted on 2004-8-2 13:20:03

By 精灵 (literally "elf") sjinny presumably means sprite. I think 子图像 ("sub-image") would be a better translation. A movable object (MOB) is a graphical object that is manipulated separately from the background. These are seen all the time in computer games. When implemented in hardware, MOBs are sometimes called sprites.

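mozilla · posted on 2004-8-2 13:45:18

[quote:f1edfceaf0="sjinny"]
2. What puzzles me is... sometimes a sprite (精灵) moves one pixel at a time, and it still looks jerky...[/quote]
I have the same puzzle. When I move a bitmap in a gdk window it never feels smooth, even though I also move it one pixel at a time. Is gdk's refresh performance just poor?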

忘 · posted on 2004-8-2 13:48:46

All that jerky hopping is just right for making hopping zombies.

mozilla · posted on 2004-8-2 13:59:04

The problem is that even a zombie wouldn't hop smoothly; there are hops within the hops.

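lanche · posted on 2004-8-2 22:46:38

First, thanks to sagaeon for the comments. "Platform" and "interpolate" should be 平台 (platform) and 插值 (interpolation) respectively, and that is now fixed. As for "resample": sampling is by nature done over and over, so it can also simply be called 采样 (sampling). I translated "sprite" as 卡通 ("cartoon"), which is how the specialist books already rendered it years ago, when I was practicing writing games for the 8-bit Nintendo.

My English is poor and my knowledge of the field is limited, so I'm satisfied with having conveyed most of the meaning, and I hope everyone will keep pointing out mistakes.

sjinny and mozilla, please believe me: the reason you haven't achieved smooth animation is that you haven't got hold of the right method. What I want to say is that partial refresh is generally more efficient than full-screen refresh, and the refresh efficiency of SDL and gdk is not low either! So why do we get serious lag or flicker in our own hands? (I think you have also tried my Tank Battle demo: although it only updates small 32x32 pixel blocks each cycle, it still flickers a bit.) Yet the games written by foreigners all run smoothly, even on Linux inside a virtual machine (several of my games run normally on the local machine but become painfully slow in a virtual machine). This shows how much depends on knowing the right method for these things.

The material I'm posting here is about exactly that. I'm studying basic theory such as game engines and fixed frame rates in order to learn the proper practices. Perhaps on the Windows platform this knowledge has long been nothing special, but among those of us who consider ourselves Linux veterans, how many actually understand it? The game I'm currently studying was written by a foreigner and originally had only four levels; I'm making more levels and will release them in a few days to share with everyone.

By the way, recent issues of the newspaper 电脑报 have been serializing some useful material in their game programming coverage. It is for Windows, but the basic theory of building a map editor, game engines, state machines and so on is completely general, so it is worth studying. I'll also talk with you about what I'm researching here when the time comes.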

lanche · posted on 2004-8-2 22:47:53

Double post.

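sjinny · posted on 2004-8-2 23:07:36

Hmm... I also get that feeling about games written by foreigners... I compared supertux with my demo and found that supertux is very smooth, with no flicker and no jerkiness... But I just noticed that supertux's image seems to be... interlaced. If you look closely you can see black lines running row by row through the image, a bit like the effect on a TV set... I wonder whether that is what makes the display smoother?

But... I don't plan to put much more effort into 2D display... With 3D there aren't so many headaches. And for 3D rendering there is a ready-made engine: OGRE (http://sourceforge.net/projects/ogre/), and there is a ready-made physics engine too: ODE (http://www.ode.org/). Both are open source. After reading some material on OGRE, I decided not to develop my own rendering engine any more... there's no way to compete with them... So for now I'm mainly doing overall architecture design and tinkering with server-side stuff... I hope to make an MMORPG some day, and this online game would be open: not just open source, but mainly... there would be a mechanism that lets players take part in designing the story and quests. Some would "create" story during play (for example by stirring up trouble), while others would write scripts and put them on a website for players to vote on whether to adopt them... In short, if such a game really gets made, or more precisely if this game engine really gets made, then future online games will be selling not just service but will also have to compete on story and so on...

But... the day before yesterday... in the small hours I was, for once, in the mood to write code... and after only a few hours, when morning came, my dad showed up and kept nagging me... so I no longer even knew what I was going to do next... maybe there will be omissions again... maybe there will be some fatal flaw... and he won't be the one bearing the consequences later...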

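lanche · posted on 2004-8-3 00:05:53

I have books on OpenGL programming at hand, but I just don't have enough time and energy (I have a family to support).

There are games even smoother than supertux; you'll see when I release it in a couple of days, heh ...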

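applepie · posted on 2004-8-3 00:19:44

lanche, keep it up! I've been waiting for this article of yours.
I also feel that what I write doesn't look as good as other people's. What exactly am I leaving out? Probably it's the full-screen refresh too. Puzzling...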
