Q:Copy a struct with a string member in C

I have a simple struct containing a string defined as a char array. I thought that copying an instance of the struct to another instance using the assignment operator would simply copy the memory address stored in the char pointer. Instead it seems that the string content is copied. I put together a very simple example:

#include <stdio.h>
#include <string.h>

struct Test{
  char str[20];
};

int main(){

  struct Test t1, t2;
  strcpy(t1.str, "Hello");
  strcpy(t2.str, "world");
  printf("t1: %s %p\n", t1.str, (void*)(t1.str));
  printf("t2: %s %p\n", t2.str, (void*)(t2.str));
  t2 = t1;
  printf("t2: %s %p\n", t2.str, (void*)(t2.str));
  return 0;
}

Compiling this code with gcc 4.9.2 I get:

t1: Hello 0x7fffb8fc9df0
t2: world 0x7fffb8fc9dd0
t2: Hello 0x7fffb8fc9dd0

As I understand, after t2 = t1 t2.str points to the same memory address it pointed before the assignment, but now inside that address there is the same string found inside t1.str. So it seems to me that the string content has been automatically copied from one memory location to another, something that I thought C would not do. I think that this behaviour is triggered by the fact that I declared str as a char[], not as a char*. Indeed, trying to assign directly one string to another with t2.str = t1.str gives this error:

Test.c: In function ‘main’:
Test.c:17:10: error: assignment to expression with array type
   t2.str = t1.str;
      ^

which makes me think that arrays are effectively treated differently than pointers in some cases. Still I can't figure out which are the rules for array assignment, or in other words why arrays inside a struct are copied when I copy one struct into another one but I can't directly copy one array into another one.

answer1:

The structure contains no pointer, but 20 chars. After t2 = t1, the 20 chars of t1 are copied into t2.

answer2:

Your struct really does contain 20 characters; it is much the same as if you had declared it as struct Test { char c1; char c2; ... };

If you want to copy only the pointer to the string, you can change the struct declaration as below and manage the memory for the string manually via the functions Test_init and Test_delete.

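#include <stdlib.h>  /* malloc, free */

struct Test{
  char* str;
};

/* Allocates len bytes for the string; str is NULL if allocation fails. */
void Test_init(struct Test* test, size_t len) {
  test->str = malloc(len);
}

/* Releases the buffer allocated by Test_init. */
void Test_delete(struct Test* test) {
  free(test->str);
}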


answer3:

If you run the following simple program

#include <stdio.h>

int main( void )
{
    {
        struct Test
        {
            char str[20];
        };
        printf( "%zu\n", sizeof( struct Test ) );
    }

    {
        struct Test
        {
            char *str;
        };
        printf( "%zu\n", sizeof( struct Test ) );
    }
    return 0;
}

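you will get a result similar to the following (the second number is the size of a pointer, for example 4 on a 32-bit platform or 8 on a 64-bit one)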

20
4

So the first structure contains a character array of 20 elements while the second structure contains only a pointer of type char *.

When one structure is assigned to another structure its data members are copied. So for the first structure all content of the array is copied in another structure. For the second structure only the value of the pointer (the address it contains) is copied. The memory pointed to by the pointer is not copied because it is not contained in the structure itself.

Arrays are not pointers, although in most expressions (with a few exceptions, such as the operand of sizeof or of unary &) an array name is converted to a pointer to its first element.

answer4:

In C a struct is a way for the compiler to know how to structure an area of memory. A struct is a kind of template or stencil which the C compiler uses to figure out how to calculate offsets to the various members of the struct.

The first C compilers did not allow struct assignment, so people had to use memcpy() to copy structs; later compilers added support for it. A C compiler typically performs a struct assignment by copying the bytes of the struct's memory area, including any padding bytes added for address alignment, from one address to another. Whatever happens to be in the source memory area is copied to the destination area. Nothing smart is done about the copy. It just copies so many bytes of data from one memory location to another.

If you have a string array in the struct or any kind of an array then the entire array will be copied since that is part of the struct.

If the struct contains pointer variables then those pointer variables will also be copied from one area to another. The result is that you will have two structs with the same data. Because the two areas are copies of each other, a particular pointer in one struct will hold the same address as the corresponding pointer in the other struct, and both will point to the same location.

Remember that a struct assignment is just copying bytes of data from one area of memory to another. For instance if we have a simple struct with a char array with the C source looking like:

typedef struct {
    char tt[50];
} tt_struct;

void test (tt_struct *p)
{
    tt_struct jj = *p;

    tt_struct kk;

    kk = jj;
}

The assembler listing output by the Visual Studio 2005 C++ compiler in debug mode for the assignment of kk = jj; looks like:

; 10   :    tt_struct kk;
; 11   : 
; 12   :    kk = jj;

  00037 b9 0c 00 00 00   mov     ecx, 12            ; 0000000cH
  0003c 8d 75 c4     lea     esi, DWORD PTR _jj$[ebp]
  0003f 8d 7d 88     lea     edi, DWORD PTR _kk$[ebp]
  00042 f3 a5        rep movsd
  00044 66 a5        movsw

This bit of code copies the data four bytes at a time from one location in memory to another: rep movsd copies 12 doublewords (48 bytes) and the final movsw copies the remaining 2 bytes, for 50 bytes in total. With a smaller char array, the compiler may opt for a different, more efficient series of instructions to copy the memory.

In C, arrays are not really handled in a smart way. An array is not a data structure in the way that a Java array is. In Java an array is an object that knows its own length. In C an array is just a region of memory, and in most expressions the array name is converted to a pointer to the first element; the name itself cannot be assigned to, so it behaves much like a constant pointer. You can have an array such as int myInts[5]; which Java would see as an array of five ints, whereas in most C expressions myInts simply yields the address of the first int.

In Java, if you try to access an array element out of range, say myInts[i] where i is 8, you will get a runtime error. In C, the same access will not produce a runtime error unless you are working with a debug build from a compiler that inserts runtime checks; the behaviour is undefined.

Experienced C programmers tend to treat arrays and pointers as similar constructs, but arrays are not exactly pointers: they share some characteristics with pointers while carrying restrictions of their own, such as not being assignable.

This kind of buffer overflow error is very easy to make in C by accessing an array past its last element. The classic example is a string copy from one char array into another where the source has no terminating zero character, so the copy runs to a few hundred bytes when you expected ten or fifteen.

c  arrays  string  struct