Pointer arithmetic in C#

aggle-rithm · Oct 23, 2006

I am working on a file parsing system in C#, and I want to share an array of bytes between objects, with each object only looking at a subset of the array. There is some overlap, as it breaks the file down into hierarchical structures, some of which contain subsets of the data within "child" objects, as well as the complete set.

In C++, this was fairly simple to do, since all I had to do was add the offset to the original pointer, and there was my new byte array. In C# it's more difficult because the array is an encapsulated object.

I've already started to work with one solution that wraps the array object in a class that allows subsets to be easily referred to, but I'm wondering if I'm missing an easier solution.
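For reference, a minimal sketch of the "subset of a shared array" idea using the framework's own `ArraySegment<byte>` struct, which stores an array reference plus an offset and a count. The `SliceDemo` class and its `SubSlice`, `Get` and `Set` helpers are hypothetical names made up for this illustration; they are not part of the framework and are not the wrapper class described above.

```csharp
using System;

// Hypothetical helper for illustration only; not a framework type.
public static class SliceDemo
{
    // Returns a view of 'count' bytes starting 'offset' bytes into the parent view.
    // No bytes are copied: the result refers to the same underlying array.
    public static ArraySegment<byte> SubSlice(ArraySegment<byte> parent, int offset, int count)
    {
        if (offset < 0 || count < 0 || offset > parent.Count - count)
            throw new ArgumentOutOfRangeException("offset");
        return new ArraySegment<byte>(parent.Array, parent.Offset + offset, count);
    }

    // Reads the byte at 'index', counted from the start of the view.
    public static byte Get(ArraySegment<byte> slice, int index)
    {
        if (index < 0 || index >= slice.Count)
            throw new ArgumentOutOfRangeException("index");
        return slice.Array[slice.Offset + index];
    }

    // Writes through the view into the shared array.
    public static void Set(ArraySegment<byte> slice, int index, byte value)
    {
        if (index < 0 || index >= slice.Count)
            throw new ArgumentOutOfRangeException("index");
        slice.Array[slice.Offset + index] = value;
    }
}
```

A parent view and a child view built this way overlap in the same buffer, so a write through the child is visible through the parent and through the original array, which is roughly what adding an offset to a pointer gives you in C++.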

Stimpson J. Cat · Oct 23, 2006

I'm not so familiar with C#, but I think you should be able to use iterators to your array class in pretty much the same way you would use pointers in C++.

Dr. Stupid

JamesM · Oct 23, 2006

So you want to be able to pass 'slices' of the array around, such that any modification to the slice also affects the whole array?

Don't fight the object-based approach. That will be the 'easier' solution in C#. It's easier than faking pointer arithmetic, at any rate.

a_unique_person · Oct 24, 2006

In-memory tables are great. You can use SQL queries to look at the subset you want.

aggle-rithm · Oct 24, 2006

JamesM said:
So you want to be able to pass 'slices' of the array around, such that any modification to the slice also affects the whole array?

Don't fight the object-based approach. That will be the 'easier' solution in C#. It's easier than faking pointer arithmetic, at any rate.

The original purpose of this system, although I might add to it in the future, is to go into an Excel file and grab just the worksheet names. Excel files have a logical format (BIFF) built onto a physical format (compound document file format), so that the logical slices can be subsets of the physical slices. For the sake of data integrity, I really didn't want to split the same data up into multiple buffers. However, memory is cheap, so I may end up taking that approach after all.

Thanks for everyone's input.

xenxabar · Oct 25, 2006

Unless performance is critical, you can easily get names of the Worksheets in a Workbook by doing one of the following:

1. Add a reference to the Excel object library in C# and simply loop through each worksheet in each file (workbook) that you want.
2. Do basically the same as in option 1 but use VBA in Excel instead.

Wowbagger · Oct 25, 2006

JamesM said:
Don't fight the object-based approach. That will be the 'easier' solution in C#. It's easier than faking pointer arithmetic, at any rate.

I would agree, here. You don't need to use pointers in C#, except in the most special of cases. (I wish I could remember some good examples of such cases.)

a_unique_person said:
In-memory tables are great. You can use SQL queries to look at the subset you want.

Yeah, but performance is a drag. The cost/benefit isn't worthwhile compared with arrays, files and so on for this purpose.
It is an idea worth considering if you have lots of data and are performing relatively complex analysis of it: you can let SQL do most of the hard work.

Grimoire · Oct 25, 2006

xenxabar said:
Unless performance is critical, you can easily get names of the Worksheets in a Workbook by doing one of the following:

1. Add a reference to the Excel object library in C# and simply loop through each worksheet in each file (workbook) that you want.

I agree 100%. I've used this technique before, and it works quite well, even if the documentation can be somewhat hard to find...

xenxabar · Oct 25, 2006

Grimoire said:
I agree 100%. I've used this technique before, and it works quite well, even if the documentation can be somewhat hard to find...

Here's an article for how to open Excel in C#:
http://www.codeproject.com/csharp/csharp_excel.asp

SpeederA · Oct 25, 2006

Use interfaces and polymorphism

aggle-rithm · Oct 25, 2006

xenxabar said:
Unless performance is critical, you can easily get names of the Worksheets in a Workbook by doing one of the following:

1. Add a reference to the Excel object library in C# and simply loop through each worksheet in each file (workbook) that you want.
2. Do basically the same as in option 1 but use VBA in Excel instead.

That brings me to my original reason for doing all this: Excel, used as an object, is extremely unstable in multi-threaded applications. Even if it's a multi-threaded application in which the instance of Excel is used ONLY IN THE MAIN THREAD, it will crash consistently.

Unfortunately, we have engineers where I work that insist on using Excel as a database system, even though they have Access and SQL Server. The worksheets are enormously complex and there are THOUSANDS of them that have to be accessed each day. Even in a single-threaded application, there are numerous times when the Excel object locks up and another needs to be started. We end up with fifty instances of Excel running in the background.

Accessing them through OLEDB is much more stable, but you have to know the sheet names, and we can't count on people always observing the naming conventions. Hence, the sheet-name-reading code.
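As a rough sketch of the sheet-name step on the OLEDB side: the table list can be requested with `OleDbConnection.GetOleDbSchemaTable(OleDbSchemaGuid.Tables, null)` and read from its `TABLE_NAME` column. The helper below only post-processes those strings. It assumes, and this is an assumption about the provider's output rather than something stated in this thread, that each worksheet is reported with a trailing `$`, wrapped in single quotes when the name contains spaces, and that entries not ending in `$` are named ranges or similar. `SheetNames` and `FromTableNames` are hypothetical names.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper for illustration only.
public static class SheetNames
{
    // Takes raw TABLE_NAME strings and returns plain worksheet names.
    public static List<string> FromTableNames(IEnumerable<string> tableNames)
    {
        List<string> result = new List<string>();
        foreach (string raw in tableNames)
        {
            if (string.IsNullOrEmpty(raw))
                continue;

            string name = raw;

            // Assumed quoting: 'My Sheet$' becomes My Sheet$
            if (name.Length >= 2 && name[0] == '\'' && name[name.Length - 1] == '\'')
                name = name.Substring(1, name.Length - 2);

            // Assumed convention: only entries ending in '$' are whole worksheets.
            if (!name.EndsWith("$", StringComparison.Ordinal))
                continue;

            // Drop the trailing '$' to get the name as shown on the sheet tab.
            result.Add(name.Substring(0, name.Length - 1));
        }
        return result;
    }
}
```

Keeping the string handling separate from the connection code means it can be checked without an Excel file or a provider installed.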

SpeederA · Oct 25, 2006

Just create an "ExcelFileInfo" class which stores the byte array and then have nested classes (or structures if the info contains no reference types and will not be boxed) which deal with the breakdown of the array into smaller units.

That, or one of the billion other reasonable ways of compartmentalizing the array. Just do it in a way that completely hides the inner data structure from the rest of the proggy.
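One way that suggestion might look, as a minimal sketch: the outer class owns the only copy of the bytes, and a nested `Section` class exposes bounds-checked windows onto it. Apart from the `ExcelFileInfo` name, everything here (`Section`, `GetSection`, `ReadUInt16`) is hypothetical, and the little-endian 16-bit read is included only as an example of a typed accessor.

```csharp
using System;

public sealed class ExcelFileInfo
{
    // The only copy of the file contents; never handed out directly.
    private readonly byte[] data;

    public ExcelFileInfo(byte[] fileBytes)
    {
        if (fileBytes == null) throw new ArgumentNullException("fileBytes");
        data = fileBytes;
    }

    // Returns a window onto the file, measured from the start of the file.
    public Section GetSection(int offset, int length)
    {
        if (offset < 0 || length < 0 || offset > data.Length - length)
            throw new ArgumentOutOfRangeException("offset");
        return new Section(this, offset, length);
    }

    // Nested so it can read the private array without exposing it to callers.
    public sealed class Section
    {
        private readonly ExcelFileInfo owner;
        private readonly int start;
        private readonly int length;

        internal Section(ExcelFileInfo owner, int start, int length)
        {
            this.owner = owner;
            this.start = start;
            this.length = length;
        }

        public int Length { get { return length; } }

        // Read-only, bounds-checked access relative to the start of this section.
        public byte this[int index]
        {
            get
            {
                if (index < 0 || index >= length) throw new IndexOutOfRangeException();
                return owner.data[start + index];
            }
        }

        // Reads two bytes as a little-endian unsigned 16-bit value.
        public int ReadUInt16(int index)
        {
            return this[index] | (this[index + 1] << 8);
        }

        // A child section, measured from the start of this section.
        public Section GetSection(int offset, int count)
        {
            if (offset < 0 || count < 0 || offset > length - count)
                throw new ArgumentOutOfRangeException("offset");
            return new Section(owner, start + offset, count);
        }
    }
}
```

Because a child section is created from its parent, it cannot reach outside the parent's range, and the rest of the program never sees the raw array.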

Your life will be much easier in the long run.

If you feel that you will be instantiating and destroying too many objects because of the numbers of calls, you can always use resurrection and an object pool to lower the load on the garbage collector.
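A minimal sketch of the object-pool half of that idea. It deliberately leaves out resurrection (re-registering an object from its finalizer) and relies on callers handing objects back explicitly; `SimplePool`, `Rent` and `Return` are hypothetical names.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical pool for illustration: reuses instances instead of allocating new ones.
public sealed class SimplePool<T> where T : class, new()
{
    private readonly Stack<T> free = new Stack<T>();
    private readonly object gate = new object();

    // Hands out a previously returned instance if one is available, otherwise a new one.
    public T Rent()
    {
        lock (gate)
        {
            if (free.Count > 0)
                return free.Pop();
        }
        return new T();
    }

    // Puts an instance back for reuse. The caller must reset its state first.
    public void Return(T item)
    {
        if (item == null) throw new ArgumentNullException("item");
        lock (gate)
        {
            free.Push(item);
        }
    }

    // Number of idle instances currently held by the pool.
    public int Count
    {
        get { lock (gate) { return free.Count; } }
    }
}
```

Whether this beats simply letting the garbage collector handle short-lived objects depends on the workload, so it is worth measuring before committing to it.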

Do NOT access Excel worksheets from a multi-threaded application using COM. Not only are Excel objects designed to be run in STA mode, they also require specific user rights and the ability to interact with the desktop. Microsoft does not support this in any way, shape, or form in the current versions of Office... =o/

a_unique_person · Oct 26, 2006

aggle-rithm said:
That brings me to my original reason for doing all this: Excel, used as an object, is extremely unstable in multi-threaded applications. Even if it's a multi-threaded application in which the instance of Excel is used ONLY IN THE MAIN THREAD, it will crash consistently.

Unfortunately, we have engineers where I work that insist on using Excel as a database system, even though they have Access and SQL Server. The worksheets are enormously complex and there are THOUSANDS of them that have to be accessed each day. Even in a single-threaded application, there are numerous times when the Excel object locks up and another needs to be started. We end up with fifty instances of Excel running in the background.

Accessing them through OLEDB is much more stable, but you have to know the sheet names, and we can't count on people always observing the naming conventions. Hence, the sheet-name-reading code.

Using Excel as a database is the work of the devil. They can't do something stupid, and expect you to unstupid it.

Grimoire · Oct 26, 2006

a_unique_person said:
Using Excel as a database is the work of the devil. They can't do something stupid, and expect you to unstupid it.

I'm totally going to steal that line...

I am going to have to disagree with you though. Of course they can do something stupid and expect someone else to unstupid it. They shouldn't, but they can, and frequently will. As a developer, I feel it is my job to smack them upside the head, and then give them a better solution. People won't respond to arguments that are basically "but you are doing it wrong".

Present to them why something is wrong, the problems found when doing it wrong, what should be done to correct it, how it can be implemented, and what the advantages are to doing it correctly. If they see the light and decide to fix it, perfect. If not, start looking for another job, because a company that willfully continues to do something the wrong way will only end up costing themselves time and money, and stress the hell out of you...

Rob Lister · Oct 26, 2006

Grimoire said:
I'm totally going to steal that line...

I am going to have to disagree with you though. Of course they can do something stupid and expect someone else to unstupid it. They shouldn't, but they can, and frequently will. As a developer, I feel it is my job to smack them upside the head, and then give them a better solution. People won't respond to arguments that are basically "but you are doing it wrong".

Present to them why something is wrong, the problems found when doing it wrong, what should be done to correct it, how it can be implemented, and what the advantages are to doing it correctly. If they see the light and decide to fix it, perfect. If not, start looking for another job, because a company that willfully continues to do something the wrong way will only end up costing themselves time and money, and stress the hell out of you...

And I intend to steal the above two paragraphs from you!

SpeederA · Oct 26, 2006

a_unique_person said:
Using Excel as a database is the work of the devil. They can't do something stupid, and expect you to unstupid it.

Apparently you've never been a consultant.

Grimoire · Oct 26, 2006

Rob Lister said:
And I intend to steal the above two paragraphs from you!

It's OK, I've released them under GPL...

69dodge · Oct 26, 2006

aggle-rithm said:
Accessing them through OLEDB is much more stable, but you have to know the sheet names, and we can't count on people always observing the naming conventions. Hence, the sheet-name-reading code.

http://www.codeproject.com/aspnet/getsheetnames.asp ?

aggle-rithm · Oct 26, 2006

69dodge said:
http://www.codeproject.com/aspnet/getsheetnames.asp ?

This was just what I needed a week ago.

Thanks for shutting that barn door, but the horse is history...

aerosolben · Oct 27, 2006

Wowbagger said:
I would agree, here. You don't need to use pointers in C#, except in the most special of cases. (I wish I could remember some good examples of such cases.)

COM.

You're welcome.
